Entries for tag "gpu", ordered from most recent. Entry count: 9.
# Vulkan API - my talk at Warsaw University of Technology
On Wednesday 16 April, around 8 PM, at Warsaw University of Technology, during weekly meeting of KNTG Polygon, I will give a talk about "Vulkan API" (in Polish). Come if you want to hear about new generation of graphics APIs, see how Vulkan API looks like, what tools are there to support it, what are advantages and disadvantages of using such API and finally decide whethere learning Vulkan is a good idea for you.
Event on Facebook: https://www.facebook.com/events/185314825611839/
# When integrated graphics works better
In RPG games the more powerful your character is, the more tough and scary are the monsters you have to fight. I sometimes get a feeling that the same applies to real life - bugs you meet when you are a programmer. I recently blogged about the issue when QueryPerformanceCounter call takes long time. I've just met another weird problem. Here is my story:
I have Lenovo IdeaPad G50-80 (80E502ENPB) laptop. It has switchable graphics: integrated Intel i7-5500U and dedicated AMD Radeon R5 M330. Of course I used to choose AMD dedicated graphics, because it's more powerful. My application is a music visualization program. It renders graphics using Direct3D 11. It uses one
ID3D11Device object and one thread for rendering, but two windows displayed on two outputs: output 1 (laptop screen) contains window with GUI and preview, while output 2 (projector connected via VGA or HDMI) shows main view using borderless, topmost window covering whole screen (but not real fullscreen as in
IDXGISwapChain::SetFullscreenState). I tend to enable V-sync on output 1 (
IDXGISwapChain::Present SyncInterval = 1) and disable it on output 0 (
SyncInterval = 0). My rendering algorithm looks like this:
Loop over frames: Render scene to MainRenderTarget Render MainRenderTarget to OutputBackBuffer, covering whole screen Render MainRenderTarget to PreviewBackBuffer, on a quad Render ImGui to PreviewBackBuffer OutputSwapChain->Present() PreviewSwapChain->Present()
So far I had just one problem with it: my framerate decreased over time. It used to drop very quickly after launching the app from 60 to 30 FPS and stabilize there, but after few hours it was steadily decreasing to 20 FPS or even less. I couldn't identify the reason for it in my code, like a memory leak. It seemed to be related to rendering. I could somehow live with this issue - low framerate was not that noticable.
Suddenly this Thursday, when I wanted to test new version of the program, I realized it hangs after around a minute from launching. It was a strange situation in which the app seemed to be running normally, but it was just not rendering any new frames. I could see it still works by inspecting CPU usage and thread list with Process Hacker. I could minimize its windows or cover them by other windows and they preserved their content after restoring. I even captured trace in GPUView, only to notice that the app is filling DirectX command queue and AMD GPU is working. Still, nothing was rendered.
That was a frightening situation for me, because I need to have it working for this weekend. After I checked that restarting app or the whole system doesn't help, I tried to identify the cause and fix it in various ways:
1. I thought that maybe there is just some bug in the new version of my program, so I launched the previous version - one that successfully worked before, reaching more than 10 hours of uptime. Unfortunately, the problem still occured.
2. I thought that maybe it's a bug in the new AMD graphics driver, so I downloaded and installed previous version, performing "Clean install". It didn't help either.
3. In desperation, I formatted whole hard drive and reinstalled operating system. I planned to to it anyway, because it was a 3-year-old system, upgraded from Windows 8 and I had some other problems with it (that I don't describe here because they were unrelated to graphics). I installed the latest, clean Windows 10 with latest updates and all the drivers. Even that didn't solve my problem. The program still hung soon after every launch.
I finally came up with an idea to switch my app to using Intel integrated graphics. It can be done in Radeon Settings > "Switchable Graphics" tab. In a popup menu for a specific executable, "High Performance" means choosing dedicated AMD GPU and "Power Saving" means choosing integrated Intel GPU. See article Configuring Laptop Switchable Graphics... for details.
It solved my problem! The program not only doesn't hang any longer, but it also maintains stable 60 FPS now (at least it did during my 2h test). Framerate drops only when there is a scene that blends many layers together on a FullHD output - apparently this GPU cannot keep up with drawing so many pixels per second. Anyway, this is the situation where using integrated Intel graphics turns out work better than a faster, dedicated GPU.
I still don't know what is the cause of this strange bug. Is it something in the way my app uses D3D11? Or is it a bug in graphics driver (one of the two I need to have installed)? I'd like to investigate it further when I find some time. For now, I tend to believe that:
- The only thing that might have changed recently and break my app was some Windows updated pushed by Microsoft.
- The two issues: the one that I had before with framerate decreasing over time and the new one with total image freeze are related. They may have something to do with switchable graphics - having two different GPUs in the system, both enabled at the same time. I suspect that maybe when I want to use Radeon, the outputs (or one of them) are connected to Intel anyway, so the image needs to be copied and synchronized with Intel driver.
Update 2018-02-21: Later after I published this post, I tried few other things to fix the problem. For example, I updated AMD graphics driver to latest version 18.2.2. It didn't help. Suddently, the problem disappeared as mysteriously as it appeared. It happened during a single system launch, without a restart. My application was hunging, and later it started working properly. The only thing that I can remember doing in between was downloading and launching UIforETW - a GUI tool for capturing Event Tracing for Windows (ETW) traces, like the ones for GPUView. I know that it automatically installs GPUView and other necessary tools on first launch, so that may have changed something in my system. Either way, now my program works on AMD graphics without a hang, reaching few hours of uptime and maintaining 60 FPS, which only sometimes drops to 30 FPS, but it also go back up.
# Rendering Optimization - My Talk at Warsaw University of Technology
If you happen to be in Warsaw tomorrow (2017-12-13), I'd like to invite you to my talk at Warsaw University of Technology. On the weekly meeting of Polygon group, this time the first talk will be about about artificial intelligence in action games (by Kacpi), followed by mine about rendering optimization. It will be technical, but I think it should be quite easy to understand. I won't show a single line of code. I will just give some tips for getting good performance when rendering 3D graphics on modern GPUs. I will also show some tools that can help with performance profiling. It will be all in Polish. The event starts at 7 p.m. Entrance is free. See also Facebook event. Traditionally after the talks we all go for a beer :)
# Pixel Heaven and Bajtek Special Issue
Do you remember "Bajtek" magazine? I don't, because I was a little kid back then, but older colleagues told me that in 80's and 90's it was a popular Polish magazine about computers (like Atari, Commodore or Amiga - platforms that were in use at that time). Archival issues can be downloaded for free from atarionline.pl.
Now, 20 years after last one, a new issue has been released. It's a single, special issue - Wydanie specjalne: Bajtek. There is my article inside - "Programowanie grafiki dzi¶" ("Graphics Programming Today"). The article describes briefly a history of graphics cards (from first 3D games, through 3Dfx Voodoo and S3 ViRGE, cards from NVIDIA and ATI/AMD, appearance of OpenGL and DirectX, to invention of shaders), shows graphics pipeline of modern GPU-s and mentions the new generation of graphics API-s (Direct3D 12 and Vulkan).
Many people who were interested in graphics programming, games or demoscene at the time of Bajtek magazine, now have a more "serious" job, whether in software development or something completely different, and they no longer have time for this hobby, so they are not up-to-date with advancements in this technology. So I thought they may like a short update on this subject.
The new issue of Bajtek was first shown on Pixel Heaven - a party that took place 3-5 June 2016 in Warsaw. I've been there and I had a great time. There were many different activities, like indie games exhibition, retro gaming zone, lectures and discussion panels.
# Vulkan 1.0 Released!
Yesterday (2016-02-16) was a big day - Vulkan 1.0 has finally been released. The new 3D graphics and compute API from Khronos Group has a chance to be the solution long awaited in the PC world that will:
Time will tell whether Vulkan becomes popular, common standard. It's not so certain. Microsoft promotes its own Direct3D 12, Apple has its Metal API, NVIDIA develops CUDA, old OpenGL and OpenCL are here to stay. What hardware versions and software platforms will eventually support the new API? What will be the quality and performance of those drivers? Will some good debugging and performance probiling tools become available? Will game developers and game engine developers port their code any time soon? What the reception will be among video/media, CAD/CAM, HPC professionals? I'm very enthusiastic, seeing so many learning materials and code samples available since day one! Just look at #Vulkan and #VulkanAPI hashtags on Twitter.
Some useful links to start with:
# Lower-Level Graphics API - What Does It Mean?
They say that the new, upcoming generation of graphics API-s (like DirectX 12 and Vulkan) will be lower-level, closer to the GPU. You may wonder what does it exactly mean or what is the purpose of it? Let me explain that with a picture that I have made few months ago and already shown on my two presentations.
Row 1: Back in the early days of computer graphics (like on Atari, Commodore 64), there were only applications (green rectangle), communicating directly with graphics hardware (e.g. by setting hardware registers).
Row 2: Hardware and software became more complicated. Operating systems started to separate applications from direct access to hardware. To make applications working on variety of devices available on the market, some standards had to be defined. Device drivers appeared as a separate layer (red rectangle).
Graphics API (Application Programming Interface), like every interface, is just the means of communication - standardized, documented definition of functions and other stuff that is used on the application's side and implemented by the driver. Driver translates these calls to commands specific to particular hardware.
Row 3: As games became more complex, it was no longer convenient to call graphics API directly from game logic code. Another layer appeared, called game engine (yellow rectangle). It is essentially a comprehensive library that provides some higher-level objects (like an entity, asset, material, camera, light) and implements them (in its graphical part) using lower-level commands of graphics API (like mesh, texture, shader).
Row 4: This is where we are now. Games, as well as game engines constantly become more complex and expensive to make. Less and less game development studios make their own engine technology, more prefer to use existing, universal engines (like Unity, Unreal Engine) and just focus on gameplay. These engines recently became available for free and on very attractive licenses, so this trend affects both AAA, as well as indie and amateur game developers.
Graphics drivers became incredibly complex programs as well. You may not see it directly, but just take a look at the size of their installers. They are not games - they don't contain tons of graphics and music assets. So guess what is inside? That is a lot of code! They have to implement all API-s (DirectX 9, 10, 11, OpenGL). In addition to that, these API-s have to backward compatible and not necessarily reflect how modern GPU-s work, so additional logic needed for that can introduce some performance overhead or contain some bugs.
Row 5: The future, with new generation of graphics API-s. Note that the sum width of the bars is not smaller than in the previous row. (Maybe it should be a bit smaller - see comment below.) That is because according to the concept of accidental complexity and essential complexity from famous book No Silver Bullet, stuff that is really necessary has to be done somewhere anyway. So lower-level API means just that driver could be smaller and simpler, while upper layers will have more responsibility of manually managing stuff instead of automatic facilities provided by the driver (for example, there is no more DISCARD or NOOVERWRITE flag when mapping a resource in DirectX 12). It also means API is again closer to the actual hardware. Thanks to all that, the usage of GPU can be optimized better by knowing all higher-level details about specific application on the engine level.
Question is: Will that make graphics programming more difficult? Yes, it will, but these days it will affect mostly a small group of programmers working directly on game engines or just passionate about this stuff (like myself) and not the rest of game developers. Similarly, there may be a concern about potential fragmentation. Time will show which API-s will be more successful than the others, but in case none of them will become standard across all platforms (Vulkan is a good candidate) and GPU/OS vendors succeed in convincing developers to use their platform-specific ones, it will also complicate life only for these engine developers. Successful games have to be multiplatform anyway and modern game engines do good job in hiding many of differences between platforms, so they can do the same with graphics.
# Lectures on ETI, Gdańsk University of Technology
Employees of Intel Technology Poland are visiting Gdańsk University of Technology, Faculty of Electronics, Telecommunications and Informatics (known as ETI). On Thursday - 8, 15, 22 January 2015, there will be lectures as part of "Computer Graphics" course. Time: 11:15 - 13:00, place: new ETI building, room NE AUD1L. It's a lecture for students of computer science, but anyone who is interested can come and listen.
Together with Piotr Kozioł, I will be presenting on January 22nd. Our presentation has title "Shaders and their compilation" and will cover:
During 2 hours we will cover lots of topics - basically all what happens to the shader after it's written in high level language and passed to graphics API - how it's processed by the driver and executed by the GPU.
# What do we have from benchmarks?
There was this case some time ago about some graphics vendors cheating in Futuremark benchmark (see this). They basically detected this particular application and raised frequency to increase performance and gain higher score. So some devices have been delisted from the Best Mobile Devices list for cheating and they published this document: Benchmark Rules and Guidelines.
My first thought was: Good, they just want everyone to play fair. But then I read the rules again, especially this one: "The platform may not replace or remove any portion of the requested work even if the change would result in the same output." and I said: Wait, what? Isn't it a generic definition of every optimization? If a developer writes 2+2 in GLSL and the platform just uses 4, is it cheating because it removed requested work (addition in this case) even if result is the same?
And then I started thinking: What do we have from benchmarks after all? Is their importance a good thing for gamers and other customers of graphics technology? In theory, benchmarks should mimick some aspect of real applications to measure and compare how different hardware performs in this type of applications (e.g. games). But it may be that decision makers want to just see good scores in benchmarks (bosses generally like numbers and bars and graphs :) so engineers implement optimizations or even some cheats just for these benchmarks. And then media notice that, devices get delisted, benchmark creators write such rules... and gamers just want to play games.
If performance was measured just in real games, and platform vendors optimized or even cheated for a particular title, then at least we would have a better performing game. Just my personal opinion :)