

Professor Jangwoo Kim, pioneering ‘Device-centric computer system technology’ (Insight, 20190301)

March 20, 2019

Developing a computer engine that can handle overflowing Big Data


Jangwoo Kim, associate professor in the SNU Department of Electrical and Computer Engineering

 

Promising IT fields such as artificial intelligence, autonomous vehicles, and supercomputers share a common prerequisite: high-speed processing of big data. High-speed graphics processing units (GPUs) are essential for training artificial intelligence, and autonomous vehicles must support multiple neural processing units (NPUs) simultaneously to handle their artificial intelligence computations. Supercomputers likewise need high-speed network devices.


Because the CPU governs devices such as GPUs, NPUs, and network devices, its performance is critical to processing big data faster.

However, because it is difficult to mount additional CPUs in current computer systems, the CPU has become unable to keep up with high-performance devices.

To solve this problem, the SNU research team created a separate device engine that takes over this role from the CPU. The resulting ‘device-centric computer system technology for fast big data processing’ increases performance more than 10 times.

The research team did not stop at developing the technology; it brought the system to a level where real production is possible, drawing attention from academia and industry. Several researchers who contributed to this work were also recognized for their results by global companies including Google, Microsoft, and Intel. Some were selected as interns at company headquarters or offered positions immediately after graduation. On February 25, we met Jangwoo Kim, the SNU associate professor who directed this research, at the engineering building on SNU's Gwanak campus in Sillim-dong, Seoul.

- What is ‘device-centric computer system technology for fast big data processing’?

“The ‘device-centric computer system’ is a novel computer system architecture developed by our lab. It minimizes the peripheral accesses that go through the operating system running on the CPU, which was a problem in conventional systems. To put it simply, let’s refer to the peripherals mounted in current computer systems, such as memory devices, network devices, and computational accelerators, as ‘devices.’ Current systems require additional high-performance CPUs as the performance and number of these devices increase. However, the CPU is a resource that cannot easily be added, so current systems have a fundamental limitation: they cannot handle many high-performance devices. The device-centric computer system we created solves this problem by mounting a separate hardware device that takes over the part of the operating system responsible for accessing devices, freeing the CPU from this role. I will call this device the ‘DCS engine.’ With the DCS engine mounted, a user program that wants to use other devices in the system sends its request to the DCS engine instead of going through the CPU. The DCS engine provides fast, direct access to the target devices and direct communication between devices. As a result, the CPU can concentrate on running user programs without being held back when high-performance devices are added or device performance improves, which dramatically increases both overall system performance and device scalability.”
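
The division of labor described above can be illustrated with a toy CPU-time model. This is a minimal sketch, not the lab's implementation: the function name, its parameters, and all numbers are hypothetical, and it assumes as a simplification that a device access handled by a separate engine costs the CPU no time at all.

```python
def requests_per_cpu_second(app_us: float,
                            accesses_per_request: int,
                            os_us_per_access: float,
                            offloaded: bool) -> float:
    """Requests one CPU core can serve per second in this toy model.

    app_us: CPU time (microseconds) of user-program work per request.
    os_us_per_access: CPU time the operating system spends per device access.
    offloaded: True if a separate engine handles device access instead of the
               CPU (simplifying assumption: the CPU then pays nothing per access).
    """
    os_us = 0.0 if offloaded else accesses_per_request * os_us_per_access
    return 1_000_000 / (app_us + os_us)


# Hypothetical workload: 20 us of program work and 4 device accesses per
# request, each access costing 5 us of operating-system time on the CPU.
conventional = requests_per_cpu_second(20.0, 4, 5.0, offloaded=False)
device_centric = requests_per_cpu_second(20.0, 4, 5.0, offloaded=True)
print(f"conventional:   {conventional:,.0f} requests/s")
print(f"device-centric: {device_centric:,.0f} requests/s")
```

In this model the gain comes entirely from removing operating-system time from the CPU's budget, so it grows as device accesses become more frequent.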

- How does the conventional system work?

“The overall performance of a current computer system depends on the number and capability of the CPUs in the system. Suppose a conventional computer system is running a database program. While the CPU is busy running the database program, whenever the program needs to read or output data, the CPU has to pause the program for a moment to access the corresponding device. Examples include reading data from a memory device for the database program, or using a network device to send the resulting data to another computer. Device access in a conventional computer system has two properties. First, devices are accessed through the operating system, the program that manages the computer system, so execution must switch between the user program and the operating system. Second, the CPU is used both for this switching and for running the operating system. Until now, this was not a big problem, because user programs such as databases did not need to access devices very often. Even when they did, the physical device access was so slow that the time the CPU spent switching to and running the operating system had minimal effect on actual system performance.”

- What are the limitations of the conventional systems for big data processing?

“For Big Data processing, conventional systems have a fundamental problem: they cannot support high-speed devices. Big Data computing requires frequent access to massive amounts of data. In addition, physical device access has become very fast with high-performance memory devices such as SSDs and fast network devices such as 100 Gbps NICs, so CPUs can now access these devices far more frequently. As a result, the time the CPU spends running the operating system is what determines the performance of the overall system, and the performance degradation caused by the CPU using the operating system to access devices has become a critical problem. Moreover, because the performance and number of CPUs in a system cannot be increased, it has also become impossible to add high-speed devices beyond a certain level.”
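
The shift in bottleneck described in this answer can be made concrete with a small calculation. The sketch below is illustrative only: the function name and the timing values are hypothetical, not measurements from the research.

```python
def os_overhead_share(os_time_us: float, device_time_us: float) -> float:
    """Fraction of one device access spent in operating-system software
    on the CPU rather than in the device itself."""
    return os_time_us / (os_time_us + device_time_us)


# Hypothetical timings: the operating-system cost stays fixed while the
# physical device access time shrinks from 10 ms to 10 us.
OS_TIME_US = 10.0
for name, device_time_us in [("slow device", 10_000.0), ("fast device", 10.0)]:
    share = os_overhead_share(OS_TIME_US, device_time_us)
    print(f"{name}: {share:.1%} of each access is operating-system time")
```

With the slow device the software cost is negligible; with the fast device the same fixed cost accounts for half of every access, which is why it becomes the limiting factor.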

- What research are you focusing on?
“I’m currently doing research in the field of computer system architecture. In a nutshell, we set a target goal, develop a system architecture that runs the target workload optimally, and demonstrate the result by implementing it in an actual physical system. For example, the device-centric computer system is an optimal system for high-performance Big Data processing workloads with frequent device access. In addition, the topics our lab is actively working on include performance analysis and optimization of large-scale systems, high-performance storage system development, artificial intelligence accelerator architecture development, brain-inspired neuromorphic computer system development, and computer systems operating at cryogenic temperatures.”

- What motivated you to conduct research in system architecture?
“What motivated me to develop the device-centric computer system was the demand from academia and industry for a solution to this problem, and my confidence that I had the professional knowledge to come up with one. Developing this kind of system requires a deep understanding of computer architecture, operating systems, and peripheral devices as background knowledge, along with expert-level knowledge of actual system hardware and software implementation. However, not many research labs worldwide satisfy both requirements. Our lab had the ability to develop this kind of system, and we felt an obligation to build an actual working system to contribute to the field. This motivated us to conduct the research.”

- How far has the technology development progressed?
“The fundamental technology enabling the device-centric computer system is the ‘DCS engine’, which currently takes the form of a programmable hardware device (FPGA). Software that allows the operating system to recognize the device when it is mounted is also available. The prototype is fully developed, and it can be mass-produced at any time to meet the needs of industry.”

Internal architecture of single node Device-Centric Server (DCS)

 

- What are the improvements compared to the conventional system?

“Mounting the DCS engine we developed turns a current computer system into a device-centric computer system. Unlike in a conventional system, overall system performance then increases linearly as additional high-speed devices are integrated. Compared with other technologies, ours delivers a larger performance improvement. It also has the advantage of generality: most devices can be mounted on the system, whereas other technologies support only certain devices. For actual data center workloads, scalability relative to price is more than 10 times higher than in conventional systems.”

- What does ‘10 times higher’ mean?

“When a conventional system runs short of devices and its performance needs to be improved, you have to buy not only the required devices but also an additional server to mount them in. For example, even if only one more memory device is needed, simply mounting it brings no performance gain because of the CPU’s operating system bottleneck, so a complete server containing the memory device is required. In contrast, our device-centric system lets only the needed devices be added to the system, so the performance gain relative to price is far better. The physical space the whole system occupies and its power consumption relative to performance also decrease significantly.”
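
The cost argument in this answer can be sketched as a simple model. Everything here is a hypothetical illustration: the function name and the prices are invented, the model assumes one extra server per added device in the conventional case, and it does not attempt to reproduce the reported figure.

```python
def cost_to_add_devices(n_devices: int,
                        device_price: float,
                        server_price: float,
                        needs_new_server: bool) -> float:
    """Total price of adding n_devices to a deployment.

    needs_new_server: True models the conventional case, where each added
    device must come with a complete server to be useful (an assumption of
    this sketch); False models adding only the devices themselves.
    """
    per_device = device_price + (server_price if needs_new_server else 0.0)
    return n_devices * per_device


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
DEVICE_PRICE = 1_000.0
SERVER_PRICE = 4_000.0

conventional_cost = cost_to_add_devices(8, DEVICE_PRICE, SERVER_PRICE, True)
device_centric_cost = cost_to_add_devices(8, DEVICE_PRICE, SERVER_PRICE, False)
print(f"conventional:   {conventional_cost:,.0f}")
print(f"device-centric: {device_centric_cost:,.0f}")
print(f"cost ratio:     {conventional_cost / device_centric_cost:.1f}x")
```

If each added device delivers the same performance in both cases, the price advantage in this model is simply the ratio of the two totals, and it grows with the price of the server that no longer has to be bought.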

 

- What prior research led to this technology?

“Our laboratory has been working on various types of high-performance computer systems. Among this earlier work, CPU performance analysis and design, storage system analysis and design, operating system optimization, and FPGA-based prototyping were directly useful in developing the device-centric computer system.”

 

- What enabled the development?

“It was the research team’s broad theoretical knowledge and hands-on experience in system development, together with a strong drive to build the system. The long-term, stable research funding provided by the Samsung Future Foundation was also a great help. Having our results accepted at top computer architecture conferences such as ISCA and MICRO also accelerated the research.”

 

- Global companies like Google and Intel recognized the results. Which parts were highly regarded?

“Developing a fully working system is much harder than presenting a paper that focuses on ideas, because multiple highly capable researchers have to concentrate on a single system for a long period of time. In fact, actual system development may seem unnecessary for a university research lab, where papers matter more. However, research results are far more persuasive when an actual working system has been built. The significance of the problem we tried to solve was well known to companies like Google and Intel, our results were accepted at top conferences such as ISCA and MICRO, as I mentioned, and the system was actually built. Together, these earned us high recognition.”

 

- What impact will this technology have as one of the 100 future technologies?

“In Big Data processing, which was chosen as one of the 100 technologies, this technology’s strength lies in supporting high-speed memory devices. We also expect it to be an advantage in developing computer systems for training artificial intelligence, which must handle multiple high-speed GPUs, and in autonomous driving, which has to support multiple heterogeneous NPU devices. It can also be used in supercomputer development, which needs to support high-speed network devices. Our lab is already working on follow-up research. We expect technology related to our device-centric computer system to play an important role in developing various high-performance computer systems in the future, as it is already being actively applied to computers currently under development.”

Source: http://ee.snu.ac.kr/community/news?bm=v&bbsidx=48560
Translated by Kyungjin Lee, English Editor of Department of Electrical and Computer Engineering, jin11542@snu.ac.kr