I'm working on a tile-based / small-sprite PC game with a group of people, and we're running into performance problems. The last time I used OpenGL was around 2004, so I've been teaching myself how to use the core profile, and I'm a bit confused.
I need to draw somewhere in the neighborhood of 250-750 48x48 tiles to the screen every frame, as well as maybe around 50 sprites. The tiles only change when a new level is loaded, and the sprites change all the time. Some of the tiles are made up of four 24x24 pieces, and most (but not all) of the sprites are the same size as the tiles. A lot of the tiles and sprites use alpha blending.
Right now I'm doing all of this in immediate mode, which I know is a bad idea. All the same, when one of our team members tries to run it, he gets very bad frame rates (~20-30 fps), and it gets much worse when there are more tiles, especially when a lot of those tiles are the kind that are cut into pieces. This all makes me think that the problem is the number of draw calls being made.
I've thought of a few possible solutions to this, but I wanted to run them by some people who know what they're talking about so I don't waste my time on something stupid:
TILES:
- When a level is loaded, draw all the tiles once into a framebuffer attached to a big honking texture, and just draw one big rectangle with that texture on it every frame.
- Put all the tiles into a static vertex buffer when the level is loaded, and draw them that way. I don't know if there's a way to draw objects with different textures with a single call to glDrawElements, or if this is even something I'd want to do. Maybe just put all the tiles into one big giant texture and use funny texture coordinates in the VBO?
SPRITES:
- Draw each sprite with a separate call to glDrawElements. This seems to involve a lot of texture switching, which I think is bad. Are texture arrays maybe useful here?
- Use a dynamic VBO somehow. Same texture question as the second tile option above.
- Point sprites? This is probably silly.
Do any of these ideas make sense? Is there a good implementation somewhere that I could look at?
Answers:
The fastest way to render tiles is to pack the vertex data into a static VBO with indices (as glDrawElements implies). Writing it into another image is totally unnecessary and will only require a lot more memory. Texture switching is VERY expensive, so you will probably want to pack all the tiles into a so-called texture atlas and give each triangle in the VBO the right texture coordinates. Based on this, it shouldn't be a problem to render 1,000 or even 100,000 tiles, depending on your hardware.
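As a rough illustration of the static, indexed tile buffer with atlas coordinates described above, here is a minimal CPU-side sketch in C++. The names `Vertex`, `TileMesh` and `buildTileMesh` are hypothetical, and the sketch assumes the atlas is a regular grid of equally sized cells with tile IDs counted row by row. It only builds the arrays; uploading them once (for example with glBufferData and GL_STATIC_DRAW) and drawing them with glDrawElements is left to the renderer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One vertex: position in pixels plus atlas texture coordinates in [0, 1].
struct Vertex {
    float x, y;
    float u, v;
};

struct TileMesh {
    std::vector<Vertex> vertices;        // 4 per tile
    std::vector<std::uint32_t> indices;  // 6 per tile (two triangles)
};

// Assumption: the atlas is a grid of atlasCols x atlasRows equal cells and
// tile IDs count row by row. tileIds holds the map row by row as well.
TileMesh buildTileMesh(const std::vector<int>& tileIds, int mapWidth,
                       float tileSize, int atlasCols, int atlasRows) {
    assert(mapWidth > 0 && atlasCols > 0 && atlasRows > 0);
    TileMesh mesh;
    mesh.vertices.reserve(tileIds.size() * 4);
    mesh.indices.reserve(tileIds.size() * 6);

    const float du = 1.0f / static_cast<float>(atlasCols);
    const float dv = 1.0f / static_cast<float>(atlasRows);
    const std::size_t width = static_cast<std::size_t>(mapWidth);

    for (std::size_t i = 0; i < tileIds.size(); ++i) {
        // Screen-space corner of this tile.
        const float x = static_cast<float>(i % width) * tileSize;
        const float y = static_cast<float>(i / width) * tileSize;
        // Top-left corner of this tile's cell inside the atlas.
        const float u = static_cast<float>(tileIds[i] % atlasCols) * du;
        const float v = static_cast<float>(tileIds[i] / atlasCols) * dv;

        const std::uint32_t base =
            static_cast<std::uint32_t>(mesh.vertices.size());
        mesh.vertices.push_back({x, y, u, v});                             // top-left
        mesh.vertices.push_back({x + tileSize, y, u + du, v});             // top-right
        mesh.vertices.push_back({x + tileSize, y + tileSize, u + du, v + dv});  // bottom-right
        mesh.vertices.push_back({x, y + tileSize, u, v + dv});             // bottom-left

        // Two triangles sharing the diagonal from top-left to bottom-right.
        const std::uint32_t quad[6] = {base,     base + 1, base + 2,
                                       base + 2, base + 3, base};
        mesh.indices.insert(mesh.indices.end(), quad, quad + 6);
    }
    return mesh;
}
```

Because every tile samples the same atlas texture, the whole map can then go out in a single indexed draw call with no texture switches in between.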
The only difference between tile rendering and sprite rendering is probably that sprites are dynamic. So for the best performance that is still easy to achieve, you can put the coordinates for the sprite vertices into a stream-draw VBO each frame and draw with glDrawElements. Pack all the textures into a texture atlas as well. If your sprites rarely move, you could also try making the VBO dynamic and updating it only when a sprite moves, but that is total overkill here, since you only want to render a few sprites.
You can have a look at a small prototype I made in C++ with OpenGL: Particulate
I render about 10,000 point sprites, I guess, at an average of 400 fps on a regular machine (quad core @ 2.66 GHz). It is CPU bound, which means the graphics card could render even more. Note that I don't use texture atlases here, since I only have a single texture for the particles. The particles are rendered with GL_POINTS and the shaders calculate the actual quad size, but I think there is also a quad renderer.
Oh, and yes, unless you have a square and use shaders for texture mapping, GL_POINTS is pretty silly. ;)
Even with this number of draw calls you shouldn't be seeing that kind of performance drop - immediate mode may be slow but it's not that slow (for reference, even dear-old Quake can manage several thousand immediate-mode calls per frame without falling down so badly).
I suspect that there is something more interesting going on here. The first thing you need to do is invest some time in profiling your program, otherwise you stand a huge risk of rearchitecting based on an assumption that may result in zero performance gain. So run it through even something as basic as GLIntercept and see where your time is going. Based on the results of that you'll be able to tackle the problem with some real info about what your primary bottleneck(s) is/are.
Okay, since my last answer kind of got out of hand, here is a new one which is maybe more useful.
About 2D-Performance
First, some general advice: 2D isn't demanding for current hardware, and even largely unoptimized code will work. That doesn't mean you should use immediate mode, though. At least make sure you don't change state when it is unnecessary (for example, don't bind a new texture with glBindTexture when the same texture is already bound; an if check on the CPU is tons faster than a glBindTexture call), and don't use something as totally wrong and stupid as glVertex (even glDrawArrays will be way faster and isn't any more difficult to use, although it's not very "modern"). With those two very simple rules the frame time should be at least down to 10 ms (100 fps). To get even more speed, the next logical step is batching, i.e. bundling as many draw calls as possible into one. For this you should consider implementing texture atlases, so you can minimize the number of texture binds and thus greatly increase the number of rectangles you can draw with one call. If you aren't down to about 2 ms (500 fps) by now, you are doing something wrong :)
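The "if check before binding" rule can be sketched as a small cache in C++. This is only an illustration: `TextureBinder` is a hypothetical helper, and the callback stands in for the real bind call (in an actual renderer it would wrap something like glBindTexture(GL_TEXTURE_2D, id)), so the sketch compiles without an OpenGL context.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Remembers the last texture that was bound and skips redundant binds.
class TextureBinder {
public:
    // bindFn is a stand-in for the real API call (assumption: in a renderer
    // it would be a lambda calling glBindTexture(GL_TEXTURE_2D, id)).
    explicit TextureBinder(std::function<void(unsigned)> bindFn)
        : bindFn_(std::move(bindFn)) {
        assert(bindFn_ != nullptr);
    }

    // Returns true if a bind was actually issued, false if it was skipped.
    bool bind(unsigned textureId) {
        if (hasBound_ && textureId == current_) {
            return false;  // same texture already bound: cheap CPU check only
        }
        bindFn_(textureId);
        current_ = textureId;
        hasBound_ = true;
        return true;
    }

private:
    std::function<void(unsigned)> bindFn_;
    unsigned current_ = 0;
    bool hasBound_ = false;
};
```

Sorting draws by texture before submitting them makes a cache like this skip far more often, and that sorting is essentially the first step toward the batching mentioned above.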
Tile maps
Implementing the drawing code for tile maps means finding a balance between flexibility and speed. You can use static VBOs, but that won't work with animated tiles. Alternatively, you can generate the vertex data each frame and apply the rules I explained above; that is very flexible, but not nearly as fast.
In my previous answer I introduced a different model in which the fragment shader takes care of all the texturing, but it was pointed out that this requires a dependent texture lookup and thus might not be as fast as the other methods. (The idea is basically that you upload just the tile indices and calculate the texture coordinates in the fragment shader, meaning that you can draw the whole map with just one rectangle.)
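To make the tile-index idea more concrete, here is a CPU-side mirror of the arithmetic such a fragment shader might perform, written in C++ so it can be run and checked without a GPU. It is a sketch under assumptions, not the original answer's shader: `UV` and `atlasCoordForFragment` are hypothetical names, the atlas is assumed to be a regular grid with tile indices counted row by row, and the `tileIndices` vector plays the role of the index texture.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct UV {
    float u, v;
};

// mapUV is the fragment's position across the single map rectangle,
// with both components in [0, 1).
UV atlasCoordForFragment(UV mapUV, const std::vector<int>& tileIndices,
                         int mapCols, int mapRows,
                         int atlasCols, int atlasRows) {
    assert(mapCols > 0 && mapRows > 0 && atlasCols > 0 && atlasRows > 0);

    // Which map cell does this fragment fall into, and where inside it?
    const float fx = mapUV.u * static_cast<float>(mapCols);
    const float fy = mapUV.v * static_cast<float>(mapRows);
    const int col = static_cast<int>(std::floor(fx));
    const int row = static_cast<int>(std::floor(fy));
    assert(col >= 0 && col < mapCols && row >= 0 && row < mapRows);
    const float inX = fx - static_cast<float>(col);
    const float inY = fy - static_cast<float>(row);

    // First lookup: the tile index stored for this cell.
    const std::size_t cell =
        static_cast<std::size_t>(row) * static_cast<std::size_t>(mapCols) +
        static_cast<std::size_t>(col);
    assert(cell < tileIndices.size());
    const int tile = tileIndices[cell];

    // The atlas coordinate depends on the value just fetched, which is why
    // the shader version needs a dependent texture lookup.
    const int atlasCol = tile % atlasCols;
    const int atlasRow = tile / atlasCols;
    return {(static_cast<float>(atlasCol) + inX) / static_cast<float>(atlasCols),
            (static_cast<float>(atlasRow) + inY) / static_cast<float>(atlasRows)};
}
```

In the shader version the second fetch (sampling the atlas) cannot start until the first one (reading the tile index) has finished, which is the cost the comment on the previous answer pointed out.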
Sprites
Sprites require a lot of flexibility, which makes them very hard to optimize beyond the techniques discussed in the "About 2D-Performance" section. And unless you want tens of thousands of sprites on the screen at the same time, it's probably not worth the effort.
If all else fails...
Set up a flip-flop drawing method: only update every other sprite at a time. Even with Visual Basic 6 and simple bit-blit methods, though, you can actively draw thousands of sprites per frame. Perhaps you should look into those methods, as your direct method of just drawing sprites seems to be failing. (It sounds more like you are using a "rendering method" but trying to use it like a "gaming method". Rendering is about clarity, not speed.)
Chances are you are constantly redrawing the whole screen, over and over, instead of redrawing only the changed areas. That is a LOT of overhead. The concept is simple, yet not easy to understand.
Use a buffer for the untouched static background. It is never rendered itself unless there are no sprites on the screen. It is constantly used to "revert" the area where a sprite was drawn, to undraw the sprite in the next call. You also need a buffer to "draw on", which is not the screen. You draw there, and once everything is drawn, you flip that buffer onto the screen, once. That should be one screen call for all your sprites. (As opposed to drawing each sprite on the screen one at a time, or attempting to do it all at once, which will make your alpha blending fail.) Writing to memory is fast and does not require screen time to "draw". Each draw call will wait for a return signal before it attempts to draw again. (Not a v-sync, but an actual hardware tick, which is a lot slower than the wait time that RAM has.)
I imagine that is part of the reason you only see this issue on one computer. Or it is falling back to software rendering of alpha blending, which not all cards support. Do you check whether that feature is supported in hardware before you attempt to use it? Do you have a fallback (a non-alpha-blend mode) if it is not? Obviously, you don't have code that limits the number of things blended, as I assume that would degrade your game content. (Particle effects are different: they are all alpha-blended, which is why programmers limit them, as they are highly taxing on most systems even with hardware support.)
Lastly, I would suggest limiting what you alpha-blend to only the things that need it. If everything needs it, you have no choice but to demand better hardware from your users, or to degrade the game for the desired performance.
Create a sprite sheet for objects and a tile set for terrain, like you would in any other 2D game; there's no need to switch textures.
Rendering tiles can be a pain because each triangle pair needs its own texture coordinates. There is a solution to this problem, however: it's called instanced rendering.
As long as you can sort your data so that, for example, you have a list of grass tiles and their positions, you can render every grass tile with a single draw call. All you have to do is provide an array of model-to-world matrices, one per tile. Sorting your data this way shouldn't be an issue with even the simplest scene graph.
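The sorting step can be sketched in C++ as follows. This is a simplified illustration rather than the answer's exact method: instead of full model-to-world matrices it stores one 2D offset per instance, which is enough for axis-aligned tiles, and `Offset` and `groupInstances` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Per-instance data: where one copy of the tile quad goes, in pixels.
// (Simplification: a 2D offset instead of a full model-to-world matrix.)
struct Offset {
    float x, y;
};

// Groups the positions of all tiles by tile type, so that each type can be
// submitted as one instanced draw. tileIds holds the map row by row.
std::map<int, std::vector<Offset>> groupInstances(
        const std::vector<int>& tileIds, int mapWidth, float tileSize) {
    assert(mapWidth > 0);
    const std::size_t width = static_cast<std::size_t>(mapWidth);

    std::map<int, std::vector<Offset>> groups;
    for (std::size_t i = 0; i < tileIds.size(); ++i) {
        const float x = static_cast<float>(i % width) * tileSize;
        const float y = static_cast<float>(i / width) * tileSize;
        groups[tileIds[i]].push_back({x, y});
    }
    return groups;
}

// In a renderer, each group's vector would be uploaded as a per-instance
// vertex attribute (glVertexAttribDivisor(attribute, 1)) and drawn with
// glDrawElementsInstanced(GL_TRIANGLES, 6, GL_UNSIGNED_INT, nullptr, count),
// where count is the size of that group's vector.
```

With this layout the number of draw calls equals the number of distinct tile types rather than the number of tiles, and all instances of a type share one set of texture coordinates.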