http://www.rngd.net
"Shut up and Play"
As for the X1900XT that is another story.
First, Xenos is not equivalent, and this can best be broken up into two areas: Shader Performance and Features.
Shader Performance. The X1900XT has 48 fragment shaders (mini ALU also I believe) + 8 high performance Vertex Shaders running @ 625MHz. Xenos has 48 unified shaders at 500MHz. The X1900XT has over 2x the *overall* shading power.
Of course this gets tricky because of how resources are delegated. e.g. If the X1900XT pixel shaders are idle due to heavy vertex work (which does happen) then Xenos would fly ahead since all 48 shader ALUs would be working on vertex work. But it is safe to say that in a Pixel Shader limited scenario the X1900XT is faster, as (a) all 48 shader ALUs are dedicated to pixel shading whereas Xenos would have to use some unified shaders for vertex work and (b) its shaders are faster.
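To make the delegation point concrete, here is a toy sketch, not a model of either chip. The ALU counts and clocks are the ones quoted above; the workloads (in ALU-cycles) are invented for illustration, and every ALU is assumed to do the same amount of work per clock, which the real parts do not.

```python
# Toy model only: ALU counts and clocks come from the post; the workloads
# (in ALU-cycles) are made up, and all ALUs are assumed equally capable.

def split_time_us(pixel_work, vertex_work, pixel_alus, vertex_alus, clock_mhz):
    """Dedicated units: the job is done when the slower stage finishes."""
    cycles = max(pixel_work / pixel_alus, vertex_work / vertex_alus)
    return cycles / clock_mhz  # MHz = cycles per microsecond


def unified_time_us(pixel_work, vertex_work, alus, clock_mhz):
    """Unified units: any ALU can take any work, so the load is pooled."""
    cycles = (pixel_work + vertex_work) / alus
    return cycles / clock_mhz


# Hypothetical workloads: (pixel ALU-cycles, vertex ALU-cycles)
workloads = {
    "pixel-heavy": (4800, 80),
    "vertex-heavy": (480, 4000),
}

for name, (pixel_work, vertex_work) in workloads.items():
    x1900 = split_time_us(pixel_work, vertex_work, 48, 8, 625)
    xenos = unified_time_us(pixel_work, vertex_work, 48, 500)
    print(f"{name}: X1900XT-like {x1900:.3f} us, Xenos-like {xenos:.3f} us")
```

With these made-up workloads the split design finishes the pixel-heavy job first and the unified design finishes the vertex-heavy job first, which is the tradeoff described above.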
Features. The short answer is Xenos has all the game-related features the X1900XT has and then some. e.g. The X1900XT does not have texture lookup on the vertex shaders. All 48 Xenos ALUs can do this. Xenos has FP10 blending (super fast HDR); X1900XT does not. Xenos has hardware tessellation and Higher Order Surfaces; X1900XT does not. Xenos has the ability to coherently write to main system memory (MEMEXPORT); X1900XT does not. The 360 CPU can write directly from the L2 cache to Xenos; the X1900XT has to read from memory or across the FSB/PCIe (much slower). Xenos has eDRAM to isolate all the intensive framebuffer tasks (FP blending, Anti-Aliasing, alpha, z, color, etc); the X1900XT has a tweaked memory controller but still uses a general pool, which makes heavy tasks, like HDR+AA, unrealistic.
I know these are "technical" terms, but to translate how these features help:
• Xenos can do curved surfaces (HOS).
• Xenos can do single pass displacement maps (HW tessellation + texture lookup in vertex shader)
• Xenos can do 4xMSAA and HDR effects with negligible penalty to performance
• Xenos can procedurally create data on the CPU and send a mesh to the hardware and have it tessellated on the fly for nice Level of Detail, i.e. LOD
• Xenos has fewer CPU<>GPU bottlenecks and Memory<>Memory bottlenecks making its shader performance the typical bottleneck (the X1900XT will wait on memory and the CPU a lot)
Everything is tradeoffs. I would not say Xenos is comparable to the X1900XT. The X1900XT is clearly a better RAW shader performer. Xenos has more features and is a lot more efficient and flexible.
Which is better is dependent on the game design, but it would be overstating to say Xenos is comparable to the X1900XT, as the X1900XT has some SIGNIFICANT advantages but also lacks some important features and flexibility that can/do affect the end product. Basically it is an example of how Xenos is a custom design for a console and how the X1900XT is made for PC games/PC platform. The nice thing is R600 will be introducing the Geometry Shader which does away with a LOT of the PC bottlenecks! :D
And of course when you take this "cross-platform" to Nvidia things get harder. e.g. G70@430MHz (7800GTX/256MB) and G70@550MHz (7800GTX/512MB) have a higher FLOP value for the programmable 32bit shader pipeline (199GFLOPs and 255GFLOPs, respectively) and yet the X1800XT beats them in F.E.A.R--yet only does 187GFLOPs for the 32bit programmable shader pipeline. The reason being the X1800XT is better at SM3.0 (branching and flow control). So EFFICIENCY is very important and this is why RAW NUMBERS are not always very telling.
This is one reason why soooo many people talk about Xenos and Unified Shaders and eDRAM and the cache locking feature (GPU reading from the CPU). Whereas the X1800XT's 187GFLOPs is split between pixel shaders and vertex shaders, all 240GFLOPs of 32bit programmable shaders in Xenos can be used all the time for anything.
This is why comparing shader operations per second is not very valuable: GPUs do NOT do stuff the same way. (Those figures say little anyway, because they include 16 bit ops as well as mini-ALU ops, which are not efficiently used and skew the numbers further.) I gave some good examples of this in how a 3.8GHz P4 does 2x the integer operations per second of an AMD 3800+, but the AMD whoops butt in integer benchmarks. Why? Because *operations* don't tell us about architecture or utilization/efficiency. The chart below, which I pulled from a forum, is technically true (these numbers are derived from the Sony slides where they gave the shader ops and GFLOPs of RSX, which in turn indicate a lot about the shader pipeline... Yes Jolli, Sony could have changed the design, but this is what we have and this is what Sony says, so let's leave it at that), but it is more of an obsession with numbers than looking "holistically" at the entire big picture:
Xenos:
48 shaders * 10 FLOPs per shader * 500MHz = 240 GFLOPs
RSX:
24 pixel shaders * 16 FLOPs per shader * 550MHz = 211.2 GFLOPs
8 vertex shaders * 10 FLOPs per shader * 550MHz = 44 GFLOPs
PS3 211.2 + 44 = 255.2 GFLOPs > 240 GFLOPs Xbox360.
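For anyone who wants to check the chart, the arithmetic is just units x FLOPs per unit per clock x clock. A minimal sketch (the function name `peak_gflops` is mine; the inputs are the chart's figures, not measured values):

```python
def peak_gflops(units, flops_per_unit_per_clock, clock_mhz):
    """Theoretical peak: units * FLOPs per unit per clock * clock, in GFLOPs."""
    return units * flops_per_unit_per_clock * clock_mhz / 1000.0


# Figures taken from the chart above
xenos = peak_gflops(48, 10, 500)
rsx_pixel = peak_gflops(24, 16, 550)
rsx_vertex = peak_gflops(8, 10, 550)
rsx_total = rsx_pixel + rsx_vertex

print(f"Xenos: {xenos:.1f} GFLOPs")
print(f"RSX:   {rsx_pixel:.1f} + {rsx_vertex:.1f} = {rsx_total:.1f} GFLOPs")
```

This only reproduces the paper numbers; it says nothing about how much of that peak either chip reaches in a game.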
From a console perspective I really would not worry about it. Cross platform games are just that. Of course the problem this gen is the 360 and PC play nice but Sony really requires games to be custom designed (or rebuilt) for CELL. But as far as graphics go, Sony 1st party games will leverage the PS3 and build engines and art around the excellent design; and MS 1st party games will leverage the 360 and build engines and art around their excellent design.
This is why the GCN and PS2's 1st party games or exclusives (stuff like RE4 and stuff made by ICO) look just as good as Xbox 1st party games. And since disparity is minor between the two consoles ports should be very similar.
As for the PC, DX10 is a totally new "platform" and should yield some jaw dropping results come 2007/2008. I personally would not buy a new GPU right now, as much as the X1900XT appeals to me. DX10 is gonna solve a lot of issues on the PC and leapfrog the consoles in performance and features.
www.cyberwarriors.nl
I'll try to summarize. He said RSX has 24 pipes and is capable of 5.7 ops per pipe = 136 shader ops per clock.
However, due to limitations of the dedicated pixel and vertex pipe setup, the true performance for the RSX is roughly 3.12 ops per pipe.
So 24 pipes x 3.12 ops per pipe = 74.8 billion shader operations per second.
The 360's Xenos, on the other hand, has 48 unified pipes and is capable of 2 ops per pipe = 96 shader operations per clock (which jollipop seems to have been right about), BUT due to the unified shader architecture and how amazingly efficient the pipelines of the Xenos are, the 2 ops per pipe is actually fully achieved.
48 unified pipes x 2 ops per pipe = 96 shader ops per clock
48 unified pipes x 2 ops per pipe = 96 billion shader operations per pipe.
The RSX is roughly 67-68% efficient in the operation of its pipelines, whereas the 360's Xenos has double the pipelines and is actually 95-99% efficient. (That efficiency will get worse from things such as HDR, AA, or soft shadows.)
He also said some other stuff too, but I'd have to talk to him another time. He said the shader ops per clock figures that companies quote are useless.
"That stuff is so strong, it's religious"
Even at 100% efficiency (2 shader ops per pipe) at 500MHz and 48 pipelines you still only achieve 48 billion shader ops per second.
48 unified pipes x 2 ops per pipe = 96 billion shader operations per pipe.
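The per-clock versus per-second distinction is easy to check. A minimal sketch, assuming the simple formula pipes x ops per pipe per clock x clock; the 550MHz RSX clock is borrowed from the FLOPs chart earlier in the thread, and the function name is made up:

```python
def shader_ops_per_second(pipes, ops_per_pipe_per_clock, clock_mhz):
    """Per-clock figure multiplied by the clock gives the per-second figure."""
    ops_per_clock = pipes * ops_per_pipe_per_clock
    return ops_per_clock * clock_mhz * 1_000_000


# Xenos at the claimed 2 ops per pipe: 96 ops per clock
xenos_ops = shader_ops_per_second(48, 2, 500)

# RSX at the claimed "true" 3.12 ops per pipe, assuming a 550MHz clock
rsx_ops = shader_ops_per_second(24, 3.12, 550)

print(f"Xenos: {xenos_ops / 1e9:.1f} billion shader ops/s")
print(f"RSX:   {rsx_ops / 1e9:.1f} billion shader ops/s")
```

Under this formula 96 is a per-clock number, and 24 x 3.12 (about 74.9) is a per-clock number too, so neither should be read as billions per second without multiplying by the clock.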
How exactly does the RSX lose so much of its shaders' performance?
I could understand it losing performance to stalling of the pipelines, but we are talking about theoretical numbers here not in game.
I understand that unified shaders by their design are much more efficient at work loads (in game) because they can lend themselves to either pixel or vertex functions depending on which needs to be finished first.
This has run its course anyways, so let's agree to disagree.
You are right though, we should agree to disagree ;)
Next time my friend is over I'll get him to type that stuff up again.
Let's for a moment assume 74B and 48B are correct for the 32bit programmable shader pipelines and *assume* they are counted the same way, etc., yada yada yada. Why is it that
RSX's 74B shader op/s = 255GFLOPs
Whereas Xenos' 48B shader op/s = 240GFLOPs
How screwy is that? It takes 200M shader operations to produce 1GFLOPs on Xenos and yet it takes 290M shader operations to produce 1GFLOPs on RSX.
So either (a) they are not comparing the same thing or (b) the number of shader operations needed to produce a FLOP is different in their architectures or (c) both! E.g. a vector4 + scalar op = 2 shader ops; likewise a vector3 + scalar op = 2 shader ops. Of course the former has a higher performance.
Using shader operations/second is worthless. By this method we would have to conclude the X850XT (43B shader op/s) is almost as fast as Xenos (48B shader op/s).
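The mismatch can be put in numbers. A minimal sketch using the figures quoted in this thread; the assumption that every lane performs a multiply-add (2 FLOPs) is mine, chosen because it matches the 10 FLOPs per shader used for Xenos in the FLOPs chart:

```python
def shader_ops_per_gflop(shader_ops_per_sec, gflops):
    """How many 'shader ops' each vendor counts per GFLOP of quoted peak."""
    return shader_ops_per_sec / gflops


xenos_ratio = shader_ops_per_gflop(48e9, 240)
rsx_ratio = shader_ops_per_gflop(74e9, 255)
print(f"Xenos: {xenos_ratio / 1e6:.0f}M shader ops per GFLOP")
print(f"RSX:   {rsx_ratio / 1e6:.0f}M shader ops per GFLOP")


def flops_for_two_shader_ops(vector_width):
    """One vector op plus one scalar op, each lane assumed to be a multiply-add."""
    FLOPS_PER_MADD = 2  # assumption: one multiply + one add per lane
    return (vector_width + 1) * FLOPS_PER_MADD


# Both count as "2 shader ops", but they do different amounts of math
print(flops_for_two_shader_ops(4), "FLOPs for vector4 + scalar")
print(flops_for_two_shader_ops(3), "FLOPs for vector3 + scalar")
```

Two "shader ops" can hide different amounts of arithmetic, which is one way the two ratios above can diverge.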
Since I found the quote: this is from Marco (user: nAo) over at beyond3d.com. He is a confirmed PS3 developer.
Shader ops are a meaningless marketing term!
Likewise, while FLOPs *are* relevant, they say nothing of design and how efficiently they are used. This is why an X1800XT (187GFLOPs) can best a 7800GTX/512MB (255GFLOPs) in some shader intensive games. Just playing with numbers ignores things like branching, flow control (both SM3.0 features that directly impact efficiency), memory utilization, pipeline stalls, disparity between shader models (e.g. unified versus traditional), etc.
I cannot believe I am saying this, but I wish they had a series of ShaderMark and 3DMark tests for consoles. MS and Sony have sooooo abused marketing terms and got people looking at numbers like shader ops (worthless!) and FLOPs (misleading!).
But it is all in history.
SNES vs. Genesis: How many colors on screen at one time
N64 vs. PS: MIPS (millions of instructions per second); Polygons per second; Frequency
PS2 vs. Xbox: Polygons per second; FLOPs; Frequency
PS3 vs. Xbox 360: FLOPs; Shader operations per second
You know why companies use these numbers? Because casual fans don't understand what they mean--they just know BIGGER=BETTER. They also don't want to understand in many cases. MS and Sony do this and this is why they play number games.
Slides on how ToyShop was done. This should explain the process of how they used shaders to make the demo :D
Dynamic Parallax Occlusion Mapping
Soft Shadows Computation
Adaptive Level of Detail system
Think Ghost Recon might be using a bit of all 3. What do you think, acert?
I know this one. Ignoring all the la-de-dah, let's just say that RSX's texture units have the exact same performance as Xenos' but run 10% faster (550MHz vs. 500MHz).
So to equal Xenos' 16 texture units, RSX will have to dedicate 14 of its texture units (16 texture units - 10% = 16 - 1.6 = 14.4) to processing textures.
Why does this affect RSX's shading abilities? Because each texture unit is coupled to one of the 48 shader ALUs in the 24 pixel pipelines, whereas they are uncoupled in Xenos, so texture processing does not take up an ALU.
When RSX is processing a texture, the coupled shader ALU cannot be used. So it's really 74B less 29% (14 of 48 ALUs) of its shader ops.
Assuming these texture units are being used 100% of the time and the pixel units are 100% efficient that's a reduction to 52.5 Billion shader ops.
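Here is that back-of-the-envelope estimate written out. It is a sketch of the reasoning in this post only: the 10% figure, the one-ALU-per-texture-unit coupling and the 74B starting point are all taken as given, and the variable names are mine.

```python
XENOS_TEXTURE_UNITS = 16
RSX_CLOCK_ADVANTAGE = 0.10   # RSX texture units assumed 10% faster
RSX_SHADER_ALUS = 48         # 2 ALUs in each of the 24 pixel pipelines
RSX_SHADER_OPS = 74e9        # the disputed figure from earlier in the thread

# Texture units RSX needs to match Xenos' 16 (14.4, rounded down as in the post)
units_needed = int(XENOS_TEXTURE_UNITS * (1 - RSX_CLOCK_ADVANTAGE))

# Each busy texture unit is assumed to tie up one shader ALU
blocked_fraction = units_needed / RSX_SHADER_ALUS
remaining_ops = RSX_SHADER_OPS * (1 - blocked_fraction)

print(f"texture units needed: {units_needed}")
print(f"shader ALUs blocked:  {blocked_fraction:.1%}")
print(f"remaining shader ops: {remaining_ops / 1e9:.1f} billion/s")
```

Without rounding the blocked share to 29% this lands a touch under the 52.5 billion quoted above, which is within the noise of an estimate this rough.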
From there it all comes down to how well Xenos load balances and how much RSX stalls.
Basically they are mutually exclusive since they are not tied to each other, although whichever one bottlenecks first affects everything else.
But you are correct: While Xenos can do HDR (FP10) and 4xMSAA together at a minor hit, RSX can only do one or the other and both HDR and AA take up a bit of bandwidth.
My guess is most devs will use DOF and Motion Blur effects, no AA, and use HDR.
There is no pipeline--it is a totally new way of thinking of the GPU. All things being even, generally speaking:
1 traditional "Pipeline" > 1 unified ALU
Typically the term "low yields" refers to poor production of the chips either because (a) not enough are hitting the target frequency or (b) because they do not work at all due to defects.
How this is handled on the PC side is a tiered structure. For (a) they offer a number of chips. e.g. The same AMD64 chip will be sold at various frequencies (1.8, 2, 2.2, 2.4 GHz etc) dependent upon the chips themselves. They test them--the best chips are binned as 2.4, and the worst as 1.8. With consoles, since you have only 1 frequency, they either work or they do not. So a Xenos chip that only runs at 450MHz is scrapped. If too many chips don't hit 500MHz, "yields are low".
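Frequency binning can be sketched in a few lines. This is an illustration of the idea only: the bin lists mirror the examples above, and `bin_chip` and the test frequencies are made up.

```python
def bin_chip(max_stable_ghz, bins_ghz):
    """Return the fastest bin the chip can be sold at, or None if it is scrapped."""
    usable = [b for b in bins_ghz if b <= max_stable_ghz]
    return max(usable) if usable else None


PC_BINS = [1.8, 2.0, 2.2, 2.4]   # same chip sold at several speed grades
CONSOLE_BINS = [0.5]             # a console has exactly one target clock

pc_result = bin_chip(2.3, PC_BINS)            # sold as the 2.2GHz part
console_result = bin_chip(0.45, CONSOLE_BINS)  # misses 500MHz, so scrapped

print(pc_result, console_result)
```

The PC part that falls short of the top bin still gets sold one grade down, while the console part that falls short has nowhere to go.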
As for (b), the solution on both the PC and console side is redundancy. On the PC, perfect chips are sold as flagship models. e.g. The 6800 Ultra has 16 pixel shaders. The 6800 Vanilla on the other hand, the same chip, has only 12. Why? Because one quad did not work so they disabled it.
To improve yields on consoles, redundancy is often designed in. e.g. Xenos has 4 full shader arrays (64 shader ALUs). Since Xenos can ONLY be sold at a set design, going with all 64 ALUs enabled would kill yields. Instead of getting 50% yields they may only get 20%. If a chip can work with 25% of its shader arrays defective, that improves yield because typically the odds of ONE defect are much higher than TWO defects. Just the law of odds on CMOS. Kind of like kids. You have a 50% chance every time of having a boy or girl, but having a boy before does not improve the odds of having a girl.
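The "one defect is more likely than two" argument is a binomial calculation. A rough sketch, assuming each shader array is good independently with the same probability; the 0.8 per-array figure is invented purely for illustration and is not a real Xenos number:

```python
from math import comb


def yield_with_spares(units, required, p_unit_good):
    """Probability that at least `required` of `units` independent units work."""
    return sum(
        comb(units, k) * p_unit_good**k * (1 - p_unit_good) ** (units - k)
        for k in range(required, units + 1)
    )


P_ARRAY_GOOD = 0.8  # hypothetical chance that a single shader array is defect-free

all_four = yield_with_spares(4, 4, P_ARRAY_GOOD)       # every array must work
three_of_four = yield_with_spares(4, 3, P_ARRAY_GOOD)  # one spare array allowed

print(f"need 4 of 4 arrays: {all_four:.1%}")
print(f"need 3 of 4 arrays: {three_of_four:.1%}")
```

This only covers the shader arrays; a defect anywhere else on the die still kills the chip.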
For this reason Xenos has redundancy in the parent and daughter die. Ditto CELL, which disables an SPE. Since SPEs make up a good portion of the chip, disabling one for redundancy improves yields. We can do some simple math on this.
• Let's say the odds are there will be 1 speck of dust (a defect) on 50% of CELL dies.
• Let's say 50% of the die space is the PPE and 50% is SPEs.
• We can assume that 25% of dies are junk off the top because if the PPE is defective the entire chip won't work. (i.e. 50% of space is PPE; 50% defect rate; 50% * 50% = 25%)
• A CELL that requires all 8 SPEs to work will also mean another 25% of dies are junk.
• But if we only require 7 SPEs to work, that means in such a scenario no dies would need to be junked due to a single SPE defect.
The result:
1PPE + 8 SPEs = 50% yields
1PPE + 7 SPEs = 75% yields
This is a REAL simplistic example, as there are many factors besides physical defects that matter, like frequency, in addition to stuff like a defect that could land between two SPEs and ruin both etc. But the principle is the same.
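The CELL example can be written out as a few lines of code. It is the same simplistic model: at most one defect per die, landing on the PPE or the SPEs in proportion to area. The function name and parameters are mine.

```python
def cell_yield(defect_rate, ppe_area_fraction, spare_spes):
    """Yield under a one-defect-per-die model with area-proportional hits."""
    p_ppe_hit = defect_rate * ppe_area_fraction        # kills the whole chip
    p_spe_hit = defect_rate * (1 - ppe_area_fraction)  # kills a single SPE
    junk = p_ppe_hit
    if spare_spes == 0:
        junk += p_spe_hit  # no spare SPE, so an SPE defect is fatal too
    return 1 - junk


yield_8_spes = cell_yield(0.5, 0.5, spare_spes=0)  # 1 PPE + 8 SPEs required
yield_7_spes = cell_yield(0.5, 0.5, spare_spes=1)  # 1 PPE + 7 SPEs required

print(f"1 PPE + 8 SPEs: {yield_8_spes:.0%}")
print(f"1 PPE + 7 SPEs: {yield_7_spes:.0%}")
```

Changing `defect_rate` or `ppe_area_fraction` shows how quickly the benefit of the spare SPE grows or shrinks under this model.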
Of course sometimes with smaller dies it makes sense to have no redundancy because the extra space used for redundancy just means more chance of a defect.
The only use of "yield" in this thread is about how the new PC stuff should "turn out" (= yield) some great graphics: