G71 (RSX) information is out

hasanahmad
Member for 7593 days
Given that G71 is a derivative of G70 just like RSX, and that G71 is 90nm just like RSX (compared to the 110nm G70), we can SAFELY assume G71 is the architecture behind Sony's RSX.

Here's the info:

655 MHz Core > RSX 550 MHz Core (already told)
256-bit interface > RSX 128-bit interface (already information given)
52 GB/s Memory Bandwidth > RSX Memory bandwidth (anyone have the number?)
15 Billion pixels/s fill rate > RSX slightly lower due to lower MHz Core and Mem Bandwidth
1450 million Vertices/s > RSX lower due to lower MHz and Mem bandwidth
24 pipes - This is the information people wanted. So we can safely assume RSX is 24 pipes

http://www.dailytech.com/article.aspx?newsid=915
In reply to
RAZurrection
Member for 7450 days
Doesn't the 7800 512MB already have substantially higher memory speed/bandwidth than the RSX specs profess it to have?

Given that they've confirmed it's based on the 7800 (confirmed by NV several times) and it shares the same clock/pipelines, and on-paper performance in many areas, it would appear to have more in common with the 7800 512MB than with this derivative.
In reply to
Jollipop
Member for 7460 days
1.45 billion vertices/s ... holy cow..!! (Toy Story graphics are finally here......well almost)

Who's to say that the G71 isn't closer to the RSX than the G70?

Could the RSX not have 2 x 128-bit interfaces, possibly for both memory pools? (this is just me speculating of course)
In reply to
hasanahmad
Member for 7593 days
Jollipop, RSX has 1.1 billion vertices; Xbox 360 has a 500 million triangle setup rate, which is equal to around 1.2 billion vertices. That's Toy Story. I knew I would catch you with that figure :P. And no, RSX has only a 128-bit interface. If you take into account the efficiency of Xenos, RSX will process 800 million vertices in-game while Xbox 360 will do 1 billion.
In reply to
RAZurrection
Member for 7450 days
Posted by Jollipop
1.45 billion vertices/s ... holy cow..!! (Toy Story graphics are finally here......well almost)

Who's to say that the G71 isn't closer to the RSX than the G70?

Could the RSX not have 2 x 128-bit interfaces, possibly for both memory pools? (this is just me speculating of course)
Well, the RSX shares clocks with a 7800 512MB, but has less bandwidth and slower memory. The differences from this 7900 are even greater.

I believe it does have 2x 128-bit interfaces for both pools, but that's a whole other issue.
In reply to
hasanahmad
Member for 7593 days
Here's the real information, sorry:


Assuming perfectly staggered threading,
48 units / 4 instructions = 12 vertices per cycle * 500 MHz = 6 billion.
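
For anyone who wants to replay that arithmetic, here is a rough Python sketch. The function and parameter names are made up for illustration; the 48 units, 4 instructions per vertex and 500 MHz are the figures from the line above, and it assumes every unit does nothing but vertex work with no other bottleneck.

```python
def peak_vertex_rate(alus, instructions_per_vertex, clock_hz):
    """Theoretical vertices/s if every ALU only transforms vertices."""
    # Vertices finished per clock, assuming perfectly staggered threading.
    vertices_per_cycle = alus / instructions_per_vertex
    return vertices_per_cycle * clock_hz


rate = peak_vertex_rate(alus=48, instructions_per_vertex=4, clock_hz=500e6)
print(f"{rate / 1e9:.1f} billion vertices/s")
```

This is a ceiling only; it says nothing about setup limits, bandwidth or pixel shading.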
In reply to
Jollipop
Member for 7460 days
Jollipop, RSX has 1.1 billion vertices; Xbox 360 has a 500 million triangle setup rate, which is equal to around 1.2 billion vertices.
The conversion of vertices to polygons isn't quite as simple as that, but I'm not gonna argue. Where did you get the 1.2 billion vertices number from, just out of curiosity?
Assuming perfectly staggered threading,
48 units / 4 instructions = 12 vertices per cycle * 500 MHz = 6 billion.
That assumes a great deal more than that, like bandwidth, and the fact that the shaders are all operating on vertex calculations.

I think MS/ATI said that the theoretical maximum would be 6 billion, assuming all pipes were working on vertex calculations, but the bandwidth would only allow 500 million triangles to be rendered. Even in this case you still have no pixel shader calculations.

It will be interesting to see what the RSX will be capable of after the bandwidth has watered its output down.
In reply to
RAZurrection
Member for 7450 days
I heard it was 6.1 billion.

You can check it out here:

http://www.beyond3d.com/forum/showthread.php?p=540...

I don't get it meself, a polygon's a polygon.
In reply to
Acert93 - Mr. Bad Cop
Acert93
Member for 7548 days
Looking good. Not much of a surprise really, since Sony dev kits have been using 6800U and 7800GTX GPUs as a base design to go forward on development. Not to mention a couple of devs saying straight out, "RSX is a modified NV40/G70". ATI in more than one interview has called RSX a G70 derivative. Which of course all goes back to the clock-performance (shader ops, flops per cycle) slides at E3 (which indicate a G70 derivative at 550MHz) and the NV stockholder statements that RSX would be a G70 derivative and "therefore will cost very little in new investment". I don't expect NV to lie to their stockholders about the R&D costs of such a deal, considering it could mean prison time.

That said, let's look at some of the numbers.
256-bit interface > RSX 128-bit interface
Technically true... but there is more to it. RSX has a 128-bit interface to the 256MB GDDR3 pool (22.4GB/s). But it can ALSO access XDR across the FlexIO, which offers 25.6GB/s of XDR memory bandwidth.

While CPU-intensive games are going to need 10-20GB/s of that bandwidth, RSX will have more than 22GB/s of bandwidth. I am guessing a lot of initial games will be more eye candy (as devs figure out CELL), and so a good portion of the 48GB/s will be available to RSX. I think this mainly because floating-point type math can work with small applets and produce a lot of information separate from main memory accesses in a lot of situations (which will be necessary with the SPEs), and because a top-end CPU like an AMD FX-57 does pretty solidly on top-end PC games with only 6.4GB/s of memory bandwidth.
52 GB/s Memory Bandwidth > RSX Memory bandwidth (anyone have the number?)
GDDR3  22.4GB/s
XDR    25.6GB/s
-------------------
Total  48GB/s
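
A quick way to sanity-check that sum is sketched below in Python. The helper name is hypothetical, and the 1400 MT/s transfer rate (700 MHz GDDR3, double data rate) is an assumption that is not stated anywhere in this thread; the 128-bit width and the 25.6GB/s FlexIO figure are the ones quoted above.

```python
def bus_bandwidth_gb_s(bus_width_bits, mega_transfers_per_s):
    """Peak bandwidth in GB/s for a memory bus (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s / 1000


# ASSUMPTION: 1400 MT/s is not given in the thread; it is chosen because it
# reproduces the quoted 22.4GB/s for a 128-bit bus.
gddr3 = bus_bandwidth_gb_s(bus_width_bits=128, mega_transfers_per_s=1400)
flexio_pool = 25.6  # GB/s to the second memory pool, as quoted above
total = gddr3 + flexio_pool
print(f"{gddr3:.1f} + {flexio_pool:.1f} = {total:.1f} GB/s")
```

Both pools are peak figures, and the CPU shares the second one, so RSX would never see the whole total.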
15 Billion pixels/s fill rate > RSX slightly lower due to lower MHz Core and Mem Bandwidth
That puts it at 12.5B for RSX. This will take a big hit at higher resolutions with AA, but realistically I don't think many games will be going for anything other than 720p anyhow. Should be more than adequate.
1450 million Vertices/s > RSX lower due to lower MHz and Mem bandwidth
I will have to check, but I think that is the setup engine (e.g. Xenos can set up 500M vertices/triangles a second--one per clock--or tessellate 250M per second--one per two clocks; yet the shaders themselves can transform like 6B+ triangles).

The reality check is this: Triangle Meshes are HUGE. They consume a lot of memory bandwidth and memory footprint (unless you procedurally generate them). This is why normal mapping is very popular. The detail of a 20MB mesh can be stored in a couple MB normal map and also be processed by pixel shaders. Lower detail meshes also means faster shadows and less work in transforming them for animation. This is why games like D3 and UE3 are Normal Map heavy.

So the number being huge is nice because it means it won't be a bottleneck; on the other side, it does not tell us too much. To put it into perspective for a 720p screen:

720p = 921,600 pixels
720p * 60fps = 55,296,000 pixels per second
1.45B / (720p * 60fps) = 26.2

What this means is that PER FRAME, RSX theoretically can have 26.2 triangles *per pixel*. (Xenos, with the 500M setup limit, is 'limited' to 9 triangles per pixel at 720p).
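
The same back-of-the-envelope math as a small Python sketch, in case anyone wants to plug in other resolutions or frame rates. The function name is made up for illustration, and it assumes 720p means 1280x720, which matches the 921,600 pixels above.

```python
def triangles_per_pixel_per_frame(triangles_per_s, width, height, fps):
    """How many triangles the quoted rate allows for each pixel of each frame."""
    pixels_per_second = width * height * fps
    return triangles_per_s / pixels_per_second


g71 = triangles_per_pixel_per_frame(1.45e9, 1280, 720, 60)   # quoted G71 rate
xenos = triangles_per_pixel_per_frame(500e6, 1280, 720, 60)  # Xenos setup limit
print(f"G71 figure: {g71:.1f}, Xenos setup limit: {xenos:.1f}")
```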

Basically, in theory these things can set up more triangles than you could ever see. In practice there are many other bottlenecks in the systems that make these numbers irrelevant.
24 pipes - This is the information people wanted. So we can safely assume RSX is 24 pipes
The E3 slides always indicated G70-like shader pipeline @ 550MHz. Of course this is NOT an official announcement of RSX, but then again everything lines up right.

Overall looking hot.

As for G71, a nice ~20% core clock boost over the 7800GTX 512MB. On the downside, no improvement on the memory. Unless they did some major tweaking to their SM3.0 implementation, I am not sure this is good enough to best the X1900XT. Why? Because in shader-heavy games the X1800XT already edges the G70@550MHz in FEAR and CoD2, and ties in BF2, and the X1900XT walks away. A 20% boost in core performance is GREAT, but ATI took a gamble and increased their peak pixel shader performance to 3x the X1800XT. Of course most current games don't need that much extra pixel shader performance, and the G71 is gonna get a boost to ROPs and TMUs as well as to shaders, but looking to the future, as shaders become more important, I think ATI made the better bet. I expect a lot of benchmarks to be split, with G71 taking a lot of games that use heavy texturing and R580 taking the more shader-heavy games. Choice is good :D

Of course it is all moot as $500+ GPUs are not the mainstay of most consumers. All bragging rights.
In reply to
hasanahmad
Member for 7593 days
It's not a matter of bandwidth, because D3D compression technology allows more than 2-3x compression of textures.

The 6 billion figure is the transform rate: the theoretical MAXIMUM figure a game can output. THEORETICALLY.

The 1.2-1.5 billion figure for Xenos is the real in-game performance it can achieve easily.
In reply to
Acert93 - Mr. Bad Cop
Acert93
Member for 7548 days
Whoa! Slow down guys! I saw the first 3 posts when I started, wrote, posted and now there are like a dozen! Slow down for me!
Posted by Has
Xbox 360 has 500 million triangles setup rate which is equal to around 1.2 billion vertices
Vertex = Triangle, at least for this type of math. In game there are usually 1.2-1.5 vertices per triangle.

(Just think of it this way: every time you add a new vertex you can link it to a number of others, which produces more and more triangles).

Heh, after skimming the rest of the thread it seems my 6B max for all 48 ALUs on Xenos was right (from memory). Of course, as Jolli is saying, we will NEVER, EVER see that. Ever. (Period).
In reply to
hasanahmad
Member for 7593 days
Acert, the figure Nvidia put out for RSX was 1.1 billion vertices per second, which != 500 million triangles per second. Xenos would be 1.2-1.5 billion vertices per second. We don't know the RSX/G71 triangle setup rate and we don't know the Xenos vertex rate.
In reply to
hasanahmad
Member for 7593 days
Acert, are you talking about setup rate or transform rate? Because the ATI X800 can do 650 billion vertices per second.
In reply to
Acert93 - Mr. Bad Cop
Acert93
Member for 7548 days
It's not a matter of bandwidth, because D3D compression technology allows more than 2-3x compression of textures.

The 6 billion figure is the transform rate: the theoretical MAXIMUM figure a game can output. THEORETICALLY.

The 1.2-1.5 billion figure for Xenos is the real in-game performance it can achieve easily.
You are confusing the ability of the shaders with the setup.

Xenos can *setup* 500M triangles--max.

In theory, if the above limitation was not set, it could transform 6B triangles if every resource was dedicated to such and no other bottlenecks were hit.

*But* vertex shaders do more than just transform vertices, e.g. some shadowing techniques and displacement maps use vertex shaders to produce their effects.

If all Vertex Shaders did was transform meshes then we could still have a Fixed Function T&L setup.

You are right about D3D compression from Xenon (believe it is 2 fold compression making the 10.8GB/s one way effectively 21.6GB/s) but remember, at its best, Xenon is going to be able to do 1/3 the work Xenos could do--and that is theoretical (no way a CPU is going to perform as well as a finely tuned multithreaded GPU pipeline). Anyhow, if Xenon pushes 2B triangles to Xenos it cannot handle them--because, again, it is setup limited to 500M/s.

Which of course is more than we need, even with significant overdraw.
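
Put as a tiny Python sketch, the point is that the slowest stage decides what comes out. The function and parameter names are invented for illustration; the 500M/s setup limit, 6B/s peak transform and the 2B/s pushed by Xenon are the numbers from this post, and real hardware has more stages than these three.

```python
def effective_triangle_rate(setup_limit, transform_rate, submitted_rate):
    """Triangles/s that actually make it through: the slowest stage wins."""
    return min(setup_limit, transform_rate, submitted_rate)


# 500M/s setup limit, 6B/s peak shader transform, 2B/s submitted by the CPU.
rate = effective_triangle_rate(
    setup_limit=500e6, transform_rate=6e9, submitted_rate=2e9
)
print(f"{rate / 1e6:.0f}M triangles/s")
```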

So, as I said yesterday, let's not get hung up on irrelevant numbers. Let's just toss out shader ops and setup limits or peak vertex transformations, because none of them tell us anything about the power of a GPU, its bottlenecks, or how it will apply to real games.
In reply to
Jollipop
Member for 7460 days
Vertex = Triangle, at least for this type of math. In game there are usually 1.2-1.5 vertices per triangle.
Is this an average?

I was always told the larger the object, the fewer vertices are used, and obviously the more objects, the more you use. It would be interesting to know how they benchmark this.

I assume maximum triangle setup would be how many individual triangles a GPU can draw per second, and maximum vertex setup would be how many vertices can be drawn as one solid object per second.
In reply to
hasanahmad
Member for 7593 days
Sorry, I meant to say the ATI X800 can do 650 million vertices per second.
In reply to
hasanahmad
Member for 7593 days
The X850 XT made available for the PC features slightly faster RAM and a slightly faster engine clock, so it yields moderately higher performance than the X800 XT for the PC: A pixel fillrate of 8.32 gigapixels per second and a geometry rate of 780 million triangles per second, compared to 8 gigapixels and 750 million triangles. It is unknown as MacCentral posted this article how the Mac version of the X850 XT differs from its PC counterpart.

http://www.macworld.com/news/2005/06/28/radeon/ind...
In reply to
Acert93 - Mr. Bad Cop
Acert93
Member for 7548 days
Posted by hasanahmad
we don't know the Xenos vertex rate
Yeah we do, it is 500M vertices. One vertex per clock.

Think of a hexagon (6-sided poly) with a dot in the middle. Draw a line from every corner of the hexagon to the middle dot. The 6 points of the hexagon + the 1 point in the middle can form 6 triangles. 7 vertices, 6 triangles.

In larger meshes you can make it so 1 triangle = 1 vertex.

And thus in GPU nomenclature 1 triangle is the same as 1 vertex in regards to such. And besides this it was confirmed numerous times on B3D.
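
To see why the ratio drifts toward 1 vertex per triangle, here is a small Python sketch. The helper names are hypothetical. It counts vertices for the six-sided example above (a closed fan around a centre point) and for a triangle strip, which is one common way of sharing vertices between neighbouring triangles; real meshes are messier, so treat it as an idealised case.

```python
def fan_counts(rim_points):
    """Closed fan: rim points plus one centre vertex, one triangle per rim point."""
    vertices = rim_points + 1
    triangles = rim_points
    return vertices, triangles


def strip_vertices(triangles):
    """Triangle strip: the first triangle needs 3 vertices, each extra one adds 1."""
    return triangles + 2


print(fan_counts(6))  # the six-sided example: 7 vertices, 6 triangles
for t in (1, 10, 1000):
    print(t, "triangles ->", strip_vertices(t) / t, "vertices per triangle")
```

A ten-triangle strip already lands at 1.2 vertices per triangle, and longer strips get closer and closer to 1.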
Acert are you talking about setup rate or transform rate
Both. Depends what part you are asking about. What part/numbers are confusing you?
In reply to
hasanahmad
Member for 7593 days
This one:

the Radeon X700 Pro has roughly the same fill rate as the Radeon 9800 XT (3.4 gigapixels) but 55 per cent better geometry performance (637.5 million triangles per second, as opposed to 412 million for the 9800 XT).

http://www.pcworld.idg.com.au/index.php/id;1854767...

The X700 can do 637 million triangles per second vs. 412 million for the 9800 XT. So that doesn't mean Xenos is between the 9800 XT and X700, does it? If it's not, what's its triangles-per-second rate in comparison with the above cards?
In reply to
Acert93 - Mr. Bad Cop
Acert93
Member for 7548 days
Posted by Jollipop
Is this an average?
I got that number from a developer (ERP). It is variable and depends on the mesh size and design.
I was always told the larger the object, the fewer vertices are used, and obviously the more objects, the more you use. It would be interesting to know how they benchmark this.
Most 3D apps, especially ones with mesh optimization features, will output this information for you. Not sure I understood the first part about larger objects having fewer vertices. Larger how?
I assume maximum triangle setup would be how many individual triangles a GPU can draw per second, and maximum vertex setup would be how many vertices can be drawn as one solid object per second.
Setup = the maximum number of vertices that can be set up for the pipeline. Since everything travels through the pipeline, you can never have more than the GPU can set up. The setup number is basically => how many can the GPU "catch" and use.

GPUs, ever since the advent of programmable vertex shaders, have been able to transform more vertices than they could set up. This is because the VS are expected to do more than just push the mesh through.

If you check my posts from the other day on Displacement Maps you can get an idea of WHAT the power of vertex shaders can be used for. This is why they have more power than the setup limit--basically, if they were the same (or less) you could do NO vertex shader effects, only set up the triangles.

I think VS are often misunderstood. This was the state of affairs before and after VS were first introduced waaaay back when:
Why use Vertex Shaders?
If you use Vertex Shaders, you bypass the fixed-function pipeline or T&L pipeline. Why would you want to skip them?

Because the hardware of a traditional T&L pipeline doesn't support all of the popular vertex attribute calculations on its own, processing is often job shared between the geometry engine and the CPU. Sometimes, this leads to redundancy.

There is also a lack of freedom. Many of the effects used in games look similar with the hard-wired T&L pipeline. The fixed-function pipeline doesn't give the developer the freedom he needs to develop unique and revolutionary graphical effects. The procedural model used with vertex shaders enables a more general syntax for specifying common operations. With the flexibility of the vertex shaders, developers are able to perform operations including:

Procedural Geometry (cloth simulation, soap bubble [Isidoro/Gosselin])
Advanced Vertex Blending for Skinning and Vertex Morphing (tweening) [Gosselin]
Texture Generation [Riddle/Zecha]
Advanced Keyframe Interpolation (complex facial expression and speech)
Particle System Rendering
Real-Time Modifications of the Perspective View (lens effects, underwater effect)
Advanced Lighting Models (often in cooperation with the pixel shader) [Bendel]
First Steps to Displacement Mapping [Calver]
And there are many more effects possible with vertex shaders, perhaps effects that nobody thought of before. For example, a lot of SIGGRAPH papers from the last couple of years describe graphical effects that are realized only on SGI hardware so far. It might be a great challenge to port these effects with the help of vertex and pixel shaders to consumer hardware.

In addition to opening up creative possibilities for developers and artists, shaders also attack the problem of constrained video memory bandwidth by executing on-chip on shader-capable hardware. Take, for example, Bézier patches. Given two floating point values per vertex (plus a fixed number of values per primitive), one can design a vertex shader to generate a position, a normal and a number of texture coordinates. Vertex Shaders even give you the possibility to decompress compressed position, normal, color, matrix and texture coordinate data and to save a lot of valuable bandwidth without any additional cost [Calver].

And there is also a benefit for your future learning curve. The procedural programming model used by vertex shaders is very scalable. Therefore the adding of new instructions and new registers will happen in a more intuitive way for developers.
http://www.gamedev.net/columns/hardcore/dxshader1/...

Vertex Shaders are even MORE capable now; and while they are frequently ignored over PS due to some limitations (now disappearing--like being able to fetch a texture) they are becoming more and more important.

Hopefully that helps explain why Vertex Transformation numbers > Vertex Setup numbers. Basically the VS units have to do a lot more than just set up the triangles--but of course the GPU can NEVER output more triangles than the setup limit.

Which is irrelevant due to the target resolution (720p) and the huge size of a 1.45B triangle mesh!
In reply to
hasanahmad
Member for 7593 days
So that means Xenos' triangles-per-second rate is less than the X700's (637 million triangles per second) and a lot less than the X850's (750+ million triangles per second)?
In reply to
Acert93 - Mr. Bad Cop
Acert93
Member for 7548 days
Posted by hasanahmad
The X700 can do 637 million triangles per second vs. 412 million for the 9800 XT. So that doesn't mean Xenos is between the 9800 XT and X700, does it? If it's not, what's its triangles-per-second rate in comparison with the above cards?
Of course not--that is like saying the Radeon X850XT is almost as fast as Xenos since it can do 43B shader ops.

These are all just "limits". The reason some GPUs have really high numbers that look out of place is that they are derivative models. E.g. the X700 has a high frequency, which inflates the setup rate, but since that high a number is completely unusable, you will see GPU makers cut down on the silicon used to set up triangles as the frequencies go up, because there is no reason to have a GPU create 10B triangles.

Die space is precious so no point wasting it on stuff never needed.
the Radeon X700 Pro has roughly the same fill rate as the Radeon 9800 XT (3.4 gigapixels)
Just to bring it up... we can remove fillrate as a metric as well. In almost every case in modern games, minus reaaaaaaly high resolutions with a lot of AA, you won't be fillrate limited.

And even then it is misleading. "Oh no, Xenos only does 4GP!" Yeah, but (a) it does 16 gigasamples when AA is enabled (i.e. no penalty for 4xAA) and (b) with the eDRAM this is a "no stop", real, honest-to-goodness number. A typical GPU, even with a 16-gigapixel fillrate, is NEVER, in its wildest dreams, going to get close to that number in a real game.
In reply to
RAZurrection
Member for 7450 days
I think they commented on its efficiency regarding those counts.

The X850 in an actual benchmark would only get 100 million polygons out (due to bottlenecks, stalling, inefficiencies etc.), but Xenos would be capable of actually performing much closer to its 500 million polys per second, despite the lower theoretical output.
In reply to
hasanahmad
Member for 7593 days
Here is TeamXbox: http://features.teamxbox.com/xbox/1145/The-Xbox-36...

The polygon performance of the Xbox 360 is as high as 500 million triangles per second, which means the Xbox 360 can process approximately 1.2 billion vertices per second. In comparison, the GeForce 6800 Ultra can process 600 million vertices per second.

The same thing is on B3D: 6 billion transform rate, 1.2 billion vertices per second.