
Xbox 360 vs. Playstation 3 vs. Nintendo Wii [part 6]

Discussion in 'Xbox Hardware' started by CaptnCook, Jul 2, 2006.

Topic status:
Not open for further replies.
  1. 36O

    36O Guest

    @Plasticman NL

    Aaah, now things are a lot clearer; yes, when you put it that way.
    But whether your theory actually holds, we'll only find out once the PS3 is out...

    Not guaranteed, of course, but that's usually how it goes; just look at the previous generation.

    That's true; I never claimed Sony is all that good/successful at creating CPUs.
    It's just that my confidence in the Cell is a lot greater than it was in the Emotion Engine back then, if only because a company like IBM is involved.

    Hah, it's not that bad: at the moment there isn't a single PC CPU with more than 2 cores, nor a single GPU that has USA (or a Vector 5 base structure; the Xenos really is the only one with that).

    @Arjan

    Of course, but isn't it usually the case that people are more interested in a next-gen console that has just come out?

    An IBM employee had already indicated that the Cell would have more power than the Xenon, but that the Cell would be less suitable for games (because it is harder to program for).

    And as for the HDD: that sounds quite logical, given that the Premium system (with HDD) also performs much better in terms of frame rate in games than the Core system (without HDD).

    I also think Blu-ray will show its advantages in the future.
    I really think the difference (between 360 and PS3 games) will be bigger by then.
    - First, the games will simply be bigger, which would come in very handy for big RPGs like Oblivion, but also for GTA-like games.
    - The bigger the storage medium, the more room there is to store game information/details, which in turn results in better graphics.
    Sounds logical, right?

    Uh-huh, but isn't it the case that the PS3 and the 360 still have a (small) lead over current PCs?
    According to MS this lead lasts until the end of 2007, but I think we should take that with a grain of salt...

    But that isn't the final version, is it? Or am I mistaken?

    ATI had an explanation for why the Xenos would be more powerful than the RSX despite the RSX running at a higher clock speed.
    They said the USA could easily counter those 50MHz.
    I was talking about the final performance, after all.

    @MacDennis

    No mate, it's not that bad; just read on, I'd say!

    @360man: Sony has had roughly twice as much time to work on the PS3 as MS had for the 360.

    @Supahmaster: with someone like you I won't even bother arguing, simple soul.
     
    Last edited by a moderator: Jul 16, 2006
  2. Arjan

    Arjan XBW.nl VIP

    Posts:
    7,207
    Likes Received:
    3
    Indeed, I agree with you there :).

    The difference between caching with or without a hard drive is quite big, yes. But I was referring more to the fact that a larger hard drive would give more speed when it comes to caching data. I don't want to say you're wrong at all; you certainly have a point. I think the complexity and size of upcoming games will have a big influence on the size of the cache.

    Certainly; a new format never takes off right away, since we're talking about a very small target audience at first, and because two new formats will be appearing on the market. Many people prefer to wait and see.

    Furthermore, I do think the capacity of a Blu-ray disc will have a positive influence on the complexity of future games. Whether the graphics will improve across the board as well remains debatable. It sounds like a logical consequence, but it doesn't have to be (it would be nice if it were, of course ;)). Doom 3 on the PC, for example, shipped on three CDs while it would have fit on a single DVD. I think the advantages of the larger storage capacity will mainly be in lossless surround (Dolby TrueHD) and bigger, more extensive games. But it remains highly dependent on what the developers do with it. In any case, a nice future lies ahead.

    ATI had to sign a contract agreeing not to use the technology from the Xbox 360 in the PC market for one year :). So in principle we will see the first video cards with features we know from the Xbox 360 in the course of 2007.

    It is indeed not the version (final build) that we as consumers will be able to buy later.

    Ah, okay :)
    With the unified shader architecture they can indeed more than make up for those 50MHz.
     
  3. SpatZuiver

    SpatZuiver Active Member

    Posts:
    36
    Likes Received:
    0
    People, you all seem convinced that the Xenon (the Xbox 360 CPU) is roughly as powerful as the Cell. Then show me conclusive proof that this is so. For proof that the Cell is more powerful, you can read below (you do have to understand it, of course).

    The Cell Processor:
    The Cell inside the Playstation 3 is an 8-core asymmetrical CPU. It consists of one Power Processing Element (PPE) and 7 Synergistic Processing Elements (SPEs). Each of these elements is clocked at 3.2GHz, and they are connected on a 4-ring Element Interconnect Bus (EIB) capable of a peak performance of ~204.8GB/s. Every processing element on the bus has its own memory flow controller and direct memory access (DMA) controller. Other elements on the bus are the memory controller to the 256MB XDR RAM, and the FlexPhase I/O controller (FlexIO).

    The FlexIO bus is capable of ~60GB/s of bandwidth. A massive chunk of this bandwidth is allocated to communicating with the RSX graphics chip, and the remaining bandwidth is where the southbridge elements lie, such as sound, optical media (Blu-ray/DVD/CD), the network interface card, hard drive, USB, memory card readers, Bluetooth devices (controllers), and WiFi. This may sound like a lot to share with the RSX, but consider that aside from the RSX, the other components are using bandwidth on the MB/s scale, not GB/s, so even if you add them all up there is still plenty of bandwidth left.

    I actually recommend you skip down to the Xbox360 hardware comparison and look at the Cell and Playstation 3 hardware diagrams before you continue reading so you get a better idea of how things come together on the system as I explain it.

    Power Processing Element:
    The PPE is based on IBM’s POWER architecture. It is a general purpose RISC (reduced instruction set) core clocked at 3.2GHz, with a 32kb L1 instruction cache, a 32kb L1 data cache, and a 512kb L2 cache. It is a 64-bit processor with the ability to fetch four instructions and issue two in one clock cycle. It is also able to handle two hardware threads. It comes with a VMX vector unit with 32 registers. The PPE is an in-order processor with delayed execution and limited out-of-order support for load instructions.

    PPE Design Goals:
    The PPE is designed to handle the general purpose workload for the Cell processor. While the SPEs are capable of executing general purpose code, they are not the best suited to do so. Compared to Intel/AMD chips, the PPE isn’t as fast for general purpose computing, considering its in-order architecture and comparatively less complex branch prediction hardware. This will likely prevent the Cell from replacing or competing with Intel/AMD chips on desktops, but in the console and multimedia world, the PPE is more than capable of keeping up with the general purpose code used in games and household devices. The Playstation 3 will not be running MS Word.

    The PPE is also simplified to save space and improve power efficiency with less heat dissipation. This also allows the processor to be clocked at higher rates. To compensate for some of the hardware shortcomings of the PPE, IBM is making an effort to improve compiler-generated code to better utilize instruction level parallelism. This would reduce the penalties of in-order execution.

    The VMX unit on the PPE is actually a SIMD unit. This gives the PPE some vector processing ability, but as you’ll read in the next section, the SPEs are better equipped for vector processing tasks. The vector unit on the PPE is probably there for the case where a task that is better run on the PPE needs some vector computations, but wouldn’t perform better overall on an SPE, or where handing the specific chunk of work off to an SPE would bring in more communication overhead than it is worth.

    Synergistic Processing Element and the SIMD paradigm:
    The SPEs are the computing powerhouses of the Cell processor. They are independent vector processors running at 3.2GHz. A vector processor is also known as a single instruction, multiple data (SIMD) processor. This means that a single instruction, let’s say addition, can be performed in one cycle using more than one set of operands, effectively adding pairs, triples, or quadruples of numbers in one cycle instead of taking 4 cycles in sequence. Here is an example of the different approaches to the problem of adding the numbers 1 and 2 together, 3 and 4, 5 and 6, and 7 and 8, to produce 4 different sums.
    On a traditional desktop CPU (scalar), the instructions are handled sequentially.
    Code:

    1. Do 1 + 2 -> Store result somewhere
    2. Do 3 + 4 -> Store result somewhere
    3. Do 5 + 6 -> Store result somewhere
    4. Do 7 + 8 -> Store result somewhere


    On a vector/SIMD CPU, the instruction is issued once and executed simultaneously for all operands.
    Code:

    1. Do [1, 3, 5, 7] + [2, 4, 6, 8] -> Store result vector [3, 7, 11, 15] somewhere


    You can see how SIMD processors can outdo scalar processors by a factor of the vector width when computations are parallel. The situation does change when the task isn’t parallel, as in the case of adding a chain of numbers like 1 + 2 + 3. Quite simply, a processor has to get the result of 1 + 2 before adding 3 to it, and nothing can avoid the fact that this operation will take 2 instructions that cannot occur simultaneously. Just to get your mind a bit deeper into this paradigm, consider 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8. On the surface, you might count 7 operations as necessary to accomplish this problem, assuming the sums have to be calculated before moving forward. However, if you try to SIMD-ize it, you would realize that this is actually still only 3 operations. Allow me to walk you through it:
    Code:

    1. Do [1, 3, 5, 7] + [2, 4, 6, 8] -> Store result vector [SUM1, SUM2, SUM3, SUM4]
    2. Do [SUM1, SUM2, 0, 0] + [SUM3, SUM4, 0, 0] -> Store result vector [SUM5, SUM6, 0, 0]
    3. Do [SUM5, 0, 0, 0] + [SUM6, 0, 0, 0] -> Store result vector [TOTAL, 0, 0, 0]


    Careful inspection of the previous solution shows two flaws. One is the optimization issue of parts of the vector not being used for the operation. Those unused parts of the vector could have been used to perform operations useful to other parts of the program. It would be a huge investment of time if developers tried to solve this problem manually by filling vectors where their code isn’t already plainly vector based. That is the type of thing IBM is placing on compilers: looking into the code for parallelism – specifically instruction level parallelism (ILP).

    The other huge problem (which I know is there but know less about) is the fact that vector processors naturally store results in a single vector. It would require some interesting misaligned calculations, shifts and/or copies of data to place the results in a position where they are ready for the next step. I am not too well versed in how this can be accomplished or whether the SPEs have the ability to do something like this, so I’ll leave it up to further discussion. Upon further research, “vector permute/alignment” seems to be the topic that addresses this problem. It seems the SPE instruction set does allow for inter-vector operations; dot products, for example, come down to one instruction.
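
    To make those shifts and copies concrete, here is a minimal sketch in C using x86 SSE intrinsics (the SPE’s permute instructions differ, but the shuffle-then-add pattern is the same; treat this as an illustration, not SPE code):
    Code:

    #include <xmmintrin.h>  /* x86 SSE intrinsics, standing in for SPE permutes */

    /* Horizontal sum of one 4-float vector [x, y, z, w]: the data has to be
       shuffled between the adds, which is exactly the alignment problem
       described above. */
    float hsum(__m128 v)
    {
        __m128 hi = _mm_movehl_ps(v, v);     /* [z, w, z, w] */
        __m128 s  = _mm_add_ps(v, hi);       /* [x+z, y+w, ...] */
        __m128 s2 = _mm_shuffle_ps(s, s, 1); /* move y+w into lane 0 */
        s = _mm_add_ss(s, s2);               /* lane 0 = x+y+z+w */
        return _mm_cvtss_f32(s);
    }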

    The SPE inside the Playstation 3 sports a 128x128-bit register file (128 registers, at 128 bits each), which is also a lot of room to unroll loops to avoid branching. At 128 bits per register, this means an SPE is able to perform operations on 4 operands of 32 bits each. Single precision floating point numbers are 32 bits, which also explains why the Playstation 3 sports such high single precision floating point performance. Double precision floating point numbers are 64 bits long and slow processing down by an order of magnitude, because only 2 operands can fit inside a vector, and I’m pretty sure it also breaks the SIMD processing ability, since no execution unit can work on 2 double precision floating points at the same time, meaning that the SPE will perform double precision computations in a scalar fashion.

    Quote:
    “An SPE can operate on 16 8-bit integers, 8 16-bit integers, 4 32-bit integers, or 4 single precision floating-point numbers in a single clock cycle.”
    – Cell microprocessor wiki. That matches up with my prediction pretty much, but I haven’t been able to find any other sources that suggest or state this. It is a very logical explanation.

    The important thing to note is that vector processing and vector processors are synonymous with SIMD architectures. Vectorized code is best run on a SIMD architecture, and general purpose CPUs will perform much worse on these types of tasks.

    SIMD Applications:
    Digital signal processing (DSP) is one of the areas where vector processors are used. I only bring that up because *you know who* would like to claim that it is the only practical application for SIMD architectures.

    3D graphics are also a huge application for SIMD processing. A vertex/vector (the terms are used interchangeably in 3D graphics) is a 3D position, usually stored with 4 elements: X, Y, Z, and W. I won’t explain the W because I don’t remember exactly how it’s used myself, but it is there in 3D graphics. Processing many vertices would be very slow on a traditional CPU, which would have to individually process each element of the vector instead of processing the whole thing simultaneously. Needless to say, GPUs most definitely have many SIMD units (possibly even MIMD), which is why they vastly outperform CPUs in this respect. Operations done on the individual components of a vector are independent, which makes the SIMD paradigm an optimal way to operate on them.

    To put this in context, I don’t know if any of you remember 3D computer gaming on low end and high end computers between 1995 and 2000. Although graphics accelerators were out, some of them didn’t have “Hardware T&L” (transform and lighting). If you recall games that had the option to turn this on or off (assuming you had it in hardware), you could see the huge speed difference between doing it in hardware and not. The software version still looked worse after developers tried to hide the speed difference by using less accurate algorithms/models. It is this type of situation the Cell is actually equipped to handle relatively well, and where traditional scalar CPUs would still perform vastly worse.

    It is worthwhile to note that “hardware” in the case of 3D graphics generally refers to things done on the GPU, and “software” just means it is running on the CPU – even though they are both pieces of hardware executing the commands in the end. Software just refers to the part that is controlled by the software the programmer writes.

    There are image filter algorithms in applications like Adobe Photoshop which are better executed by vector processors too. Many simulations that run on supercomputers are also better suited to run on SPEs (toned down in accuracy as appropriate for gaming). Some of these simulations include cloth simulation, terrain generation, physics, and particle effects.

    SPE Design Goals – no cache, such small memory, branch prediction?
    The SPEs don’t have a cache in the traditional sense of it being under hardware control. Each SPE instead has 256kb of on-chip, software-controlled SRAM. It reeks of the acronym “RAM” but offers latency similar to that of a cache, and in fact some caches are implemented using the exact same hardware – for all practical purposes, this memory is a software-controlled cache.

    Having this memory under software control places the work on the compiler tools, or on programmers, to control the flow of memory in and out of the local store. For games programming, this is actually generally the better approach if performance is a high priority. Traditional caches have the downside of being non-deterministic in their access times. If a program tries to access memory that is found in cache (a cache hit), the latency is only around 5-20 cycles and not much time is lost. If the memory is not found in cache (a cache miss), the latency is in the hundreds of cycles. This variance in performance is very undesirable in games, as steady frame rates are much more visually pleasing than variable ones.

    IBM is placing importance on compiler technology to manage the local storage well, unless the application wishes to take explicit control of this memory itself (which higher end games will probably end up doing). If it is accomplished by compilers, then to a programmer that local storage is effectively a cache either way, since they don’t have to do anything to manage it.

    The local storage is the location for both code and data on an SPE. This does make the size seem extremely limited, but rest assured that code size is generally small, especially with SIMD architectures where the data size is going to be much larger. Additionally, the SPEs are all connected to other elements at extremely high speeds through the EIB, so the idea is that even though the memory is small, data will be updated very quickly and flow in and out of them. To better handle that, the SPE is also a dual-issue processor that can simultaneously issue instructions to an execution pipe and to a load/store pipe. Basically, this means the SPE can perform computations on data while loading new data and moving out processed data.
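
    To make that “compute while streaming” idea concrete, here is a minimal double-buffering sketch in C using the DMA calls from the Cell SDK’s spu_mfcio.h; the chunk size, buffer names, and the process() routine are made up for illustration:
    Code:

    #include <stdint.h>
    #include <spu_mfcio.h>   /* Cell SDK: mfc_get() and tag-status calls */

    #define CHUNK 4096
    static char buf[2][CHUNK] __attribute__((aligned(128)));

    extern void process(char *chunk);    /* hypothetical compute routine */

    void stream(uint64_t ea, int nchunks)
    {
        int cur = 0;
        mfc_get(buf[cur], ea, CHUNK, cur, 0, 0);   /* start first DMA */
        for (int i = 0; i < nchunks; i++) {
            int nxt = cur ^ 1;
            if (i + 1 < nchunks)                   /* prefetch the next chunk */
                mfc_get(buf[nxt], ea + (uint64_t)(i + 1) * CHUNK, CHUNK, nxt, 0, 0);
            mfc_write_tag_mask(1 << cur);          /* wait only for the current tag */
            mfc_read_tag_status_all();
            process(buf[cur]);                     /* compute overlaps the next DMA */
            cur = nxt;
        }
    }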

    The SPEs have no branch prediction hardware except for a branch-target buffer, coupled with numerous branch hint instructions to avoid the penalties of branching through software-controlled mechanisms. Just to be clear right here – this information comes from the Cell BE Programming Handbook itself and thus overrides the numerous sources that have generally said “SPEs have no branch prediction hardware.” It’s there, but very limited and controlled by software rather than hardware, similar to how the local storage is controlled by software and is thus not called a “cache” in the traditional sense.
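
    As a rough illustration of software-guided branching in C: GCC’s __builtin_expect tells the compiler which way a branch usually goes so it can lay out the common path fall-through (whether a given SPE compiler turns this into an explicit branch hint instruction is an assumption here; the struct and function names are made up):
    Code:

    struct obj { int visible; };
    extern void render(struct obj *o);
    extern void skip(struct obj *o);

    void draw(struct obj *o)
    {
        if (__builtin_expect(o->visible, 1))
            render(o);   /* common path, laid out fall-through */
        else
            skip(o);     /* rare path, branch penalty acceptable */
    }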

    How the Cell “Works”:
    This could get very detailed if I really wanted to explain every little thing about the inner workings of the Cell. In the interest of time, I will only mention some of the key aspects so you may get a better understanding of what is and isn’t possible on the Cell.

    There are 11 major elements connected to the EIB in the Cell: 1 PPE, 8 SPEs, 1 FlexIO controller, and 1 memory controller. In the setup for the Playstation 3, one SPE is disabled, so there are only 10 operational elements on it. When any of these elements needs to send data or commands to another element on the bus, it sends a request to an arbiter that manages the EIB. The arbiter decides what ring to put the data on, and when, to efficiently distribute resources and avoid contention. With the exception of the memory controller (connected to RAM), any of the elements on the EIB can make requests to read or write data from other elements on the EIB. IBM has actually filed quite a number of patents on how the EIB alone works to make the most efficient use of its bandwidth. The system of bandwidth allocation is broken down in detail; in general, I/O requests are handled with the highest priority.

    Each processing element on the Cell has its own memory controller. For the PPE, this is transparent, since it is the general purpose processor. A load/store instruction executed on the PPE will go through the L2 cache and ultimately make changes to main system memory without further intervention. Under the hood though, the PPE’s memory controller sets up a request to the arbiter of the EIB to send its data to the memory controller of the system memory. This event is transparent to the load/store instruction on the PPE, so to the PPE, RAM is its main memory. The SPEs are under a different mode of operation. To the SPEs, a load/store instruction works on their local storage. The SPE has its own memory controller to access system RAM just like the PPE, but it is under software control. This means that programs written for the SPE have to set up requests on their own to read or write the system memory that the PPE primarily uses. The same messages can also be used to send data or commands to another element on the EIB.

    This is important to remember, because it means that all of the elements on the EIB have equal access to any of the hardware connected to the Cell in the Playstation 3. Rendering commands could come from the PPE or an SPE, seeing as they both ultimately have to send commands and/or data to the I/O controller, which is where the RSX is connected. By the same idea, if any I/O device connected through FlexIO needs to read or write system memory, it can also send messages directly to the XDR memory controller, or send a signal to the PPE or an SPE instead.

    The communication system between elements on the Cell processor is highly advanced and well planned out, and probably constitutes a huge portion, if not most, of the research budget for the Cell processor. It allows for extreme performance and flexibility for whoever develops any kind of software for the Cell. There are several new patents IBM has submitted that relate just to how transfers over the EIB are set up. After all, as execution gets faster and faster, the general problem is having memory keep up.

    Note: this section is extremely scaled down and simplified. It is to the point where, if you read the Cell BE Handbook, you could say I’m wrong in many places if I implied or suggested that only one method of communication is possible, or if you use my literal word choice against theirs. If you are wondering how something would or should be accomplished on the Cell, you’d have to dive deeper into the problem to figure out which method is best to use. The messaging system between elements on the EIB is extremely complex and detailed in nature and just couldn’t be explained in a compact form.

    Multithreading?
    Threading is simply a word used to describe a sequence of execution. Technically, a single core CPU can handle an arbitrary number of threads; the issue is that performance drops at a certain point, depending on what the individual tasks are doing. The PPE runs two hardware threads on the same core. This makes communication between these two threads easy, since they are using the exact same memory resources. Sharing data between these threads is only a matter of using the same variables in code and keeping the threads synchronized – much of which has been done and thoroughly studied.
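
    A generic illustration of that “same memory resources” point, as a minimal C/pthreads sketch (nothing PPE-specific; the counter and names are made up):
    Code:

    #include <pthread.h>

    /* Two threads sharing one address space: communication is just a
       shared variable plus synchronization. */
    static long frame_count = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&lock);
        frame_count++;               /* same memory the other thread sees */
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, worker, NULL);
        pthread_create(&b, NULL, worker, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return (int)frame_count;     /* 2: no copying or messaging needed */
    }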

    On the other hand, the SPEs are more isolated execution cores that have their own primary memory which is their local store. Sharing data between SPEs and the PPE means putting data on the EIB, which means that one of the messaging methods has to be used to get it there. There are various options for this depending on what needs to be transferred and how both ends are using the data. Needless to say, synchronization between code running on SPEs and the PPE is a harder problem. It is better to think of the code running on separate SPEs as separate programs rather than threads to scale the synchronization and communication issues appropriately.

    That being said, it isn’t a problem that hasn’t been seen before, as it is pretty much the same as inter-process communication between programs running on an operating system. Each application individually thinks it has exclusive access to the hardware. If it becomes aware of other programs running, it has to consider how to send and receive data from the other application too. The only added considerations on the Cell are the hardware implementation details of the various transfers, to maximize performance even if more than one method works.
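
    For a flavor of what that communication looks like on the SPE side, here is a minimal sketch in C using the mailbox calls from the Cell SDK’s spu_mfcio.h (the command convention and do_work() are made up for illustration):
    Code:

    #include <stdint.h>
    #include <spu_mfcio.h>   /* Cell SDK: SPU mailbox calls */

    extern void do_work(uint32_t cmd);   /* hypothetical task handler */

    int main(void)
    {
        for (;;) {
            uint32_t cmd = spu_read_in_mbox();  /* block until the PPE sends a word */
            if (cmd == 0)                       /* 0 = quit; a made-up convention */
                break;
            do_work(cmd);
            spu_write_out_mbox(cmd);            /* signal completion back to the PPE */
        }
        return 0;
    }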

    Programming Paradigms/Approaches for the Cell:
    Honestly, the most important thing to mention here is that the Cell is not bound to any paradigm. Any developer should assess what the Cell hardware offers, and find a paradigm that will either be executed fastest, or sacrifice speed for ease of development and find a solution that’s just easy to implement. That being said, here are some common paradigms that come up in various sources:

    PPE task management, SPEs task handling:
    This seems the most logical to many, due to the SPEs being the computational powerhouses inside the Cell while the PPE is the general purpose core. The keyword is computational, which should indicate that the SPEs are good for computing tasks, but not all tasks. Tasks of a general purpose nature will perform better on the PPE, since it has a cache and branch prediction hardware – making coding for it much easier without having to manage those issues. Limiting the PPE to dictating tasks is stupid if the entire task is general purpose in nature. If the PPE can handle it alone, it should do so and not spend time handing off tasks to other elements. However, if the PPE is overloaded with general purpose tasks to accomplish, or has a need for certain computations which the SPEs are better suited for, it should hand them off to an SPE, as the gain in doing so will be worthwhile as opposed to being bogged down running multiple jobs that can be divided up more efficiently.

    Having the PPE fill a task manager role may also mean that all SPEs report or send their data back to the PPE. This has a negative impact on achievable bandwidth, as the EIB doesn’t perform as well when massive amounts of data are all going to a single destination element inside the Cell. This might not happen if the tasks the elements are running talk to other elements, including external hardware devices, main memory, or other SPEs.

    SPE Chaining:
    This solution is basically using the SPEs in sequence to accomplish steps of a task such as decoding audio/video. Basically, an SPE sucks in data continuously, processes it continuously, and spits it out to the next SPE continuously. The chain can utilize any number of SPEs available and necessary to complete the task. This setup is considered largely due to the EIB on the Cell being able to support massive bandwidth, and the fact that the SPEs can be classified as an array of processors.

    This setup doesn’t make sense for everything, as dependencies may require that data revisit certain stages more than once rather than simply pass through once and be done. Sometimes, due to dependencies, a certain amount of data has to be received before processing can actually be completed. Lastly, various elements may not produce output that a strict “next” element needs: some of it may be needed by one element, and more by another.

    CPU cooking up a storm before throwing it over the wall:
    This honestly was a paradigm I initially thought about independently, early into my research on the details of the Cell processor. It’s not really a paradigm, but rather an approach/thought process. Even the Warhawk designer/producer mentioned an approach like this. The Cell is a really powerful chip and can do a lot of computational work very fast inside the processor. The problem is that bandwidth to components outside of the chip brings in communication overheads, and those become bottlenecks as well. It seems like a less optimal use of computing resources for the PPE to write output to memory and have all of the SPEs pick up work from there, when the PPE can send data to the SPEs directly, removing the bottleneck of them all sharing the 25.6GB/s bandwidth to system memory. It appears to make the most sense to let the Cell load and process the game objects as much as possible before handing them off to the RSX or writing back to memory.

    This approach does make sense, but it is by no means a restriction if a game has serious uses and demands for a tight relationship between the Cell and the RSX or other off-chip elements throughout the game loop.

    Where does the operating system go?
    Some sources propose that an operational SPE will be reserved by Sony for the operating system while games are running. As far as I have researched, I have found nothing official to support this being the case with the PS3, other than Ken Kutaragi saying an OS could run on an SPE, and IBM’s papers suggesting various Cell operating system configurations.

    The specific configuration of running an OS (kernel only) on an SPE makes sense from a security perspective. I will not explain it in this post, but the Cell does have a security architecture which can enable an SPE to be secured through hardware mechanisms. Given this ability, if Sony wanted an easy method to protect its operating system from games and homebrew, they would probably resort to running a kernel with light OS features on an SPE.

    Otherwise, the short answer is that the OS could run as a tiny thread on the PPE, or on an SPE. Sony will do whatever has the least impact on gaming while still delivering on the functional requirements of the OS.


    --------------------------------------------------------------------------------
    Playstation 3 and Xbox360 – Comparing and Contrasting:
    Before I compare and contrast with the Xbox360 hardware, here are some quick facts about the Xbox360 hardware:
    Xbox360 Quick Hardware Summary:
    The Xbox360 has a CPU with three symmetrical cores. Each of the cores is based on the POWER architecture, like the PPE inside the Cell, and is clocked at 3.2GHz. Each core has a 32kb L1 instruction cache and a 32kb L1 data cache, and the cores share a 1MB L2 cache. Each core also sports an enhanced version of the VMX-128 instruction set and execution units. This enhanced version expands the register file from 32 128-bit registers to a pair of sets of 128 128-bit registers each – with one execution unit per core. Each of these cores can also dual-issue instructions and handles two hardware threads, bringing the Xbox360 total to 6 hardware threads. The CPU and GPU share 512MB of GDDR3 RAM. The Xbox360’s GPU, codenamed “Xenos”, is designed by ATI and sports 48 shader pipelines using a unified shader architecture. The Xbox360 GPU also has 10MB of eDRAM for the frame buffer, with over 200GB/s of bandwidth between this eDRAM and a simple logic unit, for a limited set of 3D processing effects such as anti-aliasing and z-buffering.

    The system sports a DVD9 optical media drive from which games are loaded, a controller with rumble features, and 100Mbps Ethernet.

    Head To Head:
    General Architecture Differences:
    One thing I think is important when looking at CPU architecture is visuals. In the world of computing, physical distance between parts of a computer system generally corresponds with the speed (latency-wise) of their communication. A diagram also shows the flow of memory, outlining where bottlenecks might exist for certain components accessing large amounts of data from specific areas of memory.

    Here are two diagrams of the major components on the Xbox360 motherboard:




    Here are two diagrams of the Xenon CPU:



    Comparatively, it is harder to find detailed diagrams of PS3 hardware, but here is one I found on AnandTech:

    This diagram has a likely discrepancy in the southbridge (I/O) being connected through the RSX. It is more likely the southbridge connects to the Cell directly via FlexIO, given the large bandwidth available through that interface and the GPU not being a recipient of I/O.


    There are plenty of other Cell diagrams on the internet and here are two of them:



    Bandwidth Assessment:
    I recall an article IGN released shortly after or during E3 2005 comparing the Playstation 3 and Xbox360. Microsoft analyzed the total system bandwidth in the Xbox360 and came up with some outrageous numbers compared to the Playstation 3. One of the big reasons for this total being higher is the 256GB/s bandwidth between the daughter die and parent die in the Xenos (graphics chip). I will explain the use of the eDRAM memory later, but it is important to know that the logic performed between those two components with 256GB/s of bandwidth hardly constitutes a system component when considering where game processing takes place. Additionally, Microsoft added up bandwidths that weren’t relevant to major component destinations such as “to CPU” or “to GPU.” Context like that matters a lot, because bandwidth between any two elements is only as fast as the slowest memory bus in between. The only bandwidth figures that make sense to add together are those on separate buses to the end destination.

    The biggest ugly (and this really is a big one) in the Xbox360 diagram should be the location of the CPU relative to the main system memory: it has to be accessed through the GPU’s memory controller. The Xbox360 GPU has 22.4GB/s of bandwidth to the system’s unified memory, and this bandwidth is split between the GPU’s needs and the CPU’s. A simple investigation shows that if the Xenon (Xbox360 CPU) were using its full 21.6GB/s bandwidth to system memory, there would be 800MB/s left for the GPU. If the GPU were using its full bandwidth to this memory, none would be left for anything else. Additionally, the southbridge (I/O devices) is connected through the GPU as well, and all of this traffic is actually destined for the CPU, unless sound for the Xbox360 is done on the Xenos. The impact of this is considerably less, since I/O devices probably won’t exceed more than a few hundred MB/s during a game, and this traffic isn’t part of the GPU’s 22.4GB/s access to main memory. It does still go through the same bus that the CPU uses to access RAM, though.

    Looking at the diagram of the Playstation 3, you can see that the RSX has a dedicated 22.4GB/s to its video memory, and the Cell has a dedicated 25.6GB/s to its main memory. Additionally, if you wanted to find the bandwidth the RSX could use from the Cell’s main memory, it would go through the 35GB/s link between itself and the Cell, then through the Cell processor’s FlexIO controller, onto the EIB, to the Cell’s memory controller, which is the gatekeeper to RAM. The slowest link in the line is the bandwidth the XDR memory controller provides, which is 25.6GB/s. If the RSX uses this extra bandwidth, it is being shared with the Cell. In general though, the major components in the Playstation 3 have their own memory to work with, which provides maximum bandwidth.

    In terms of peak performance, if both the GPU and CPU for both consoles were pushing the maximum bandwidths from their respective memory banks, the total for Xbox360 would be 22.4GB/s, and the total for the Playstation 3 would be 48GB/s. I believe this to be the most important bandwidth measure as both of these elements are the major programmable elements of a gaming machine. They will be processing game data or graphics data independently, and need fast access and high bandwidth to what they are working on.

    While the Xbox360’s shared bandwidth is a big downside in the grand scheme of things considering potential, Microsoft probably allowed this due to the nature of game loops, which often don’t involve both the CPU and GPU needing high bandwidth simultaneously. Overall, during a game loop, the Xbox360 will probably use its 22.4GB/s of bandwidth almost constantly, with the CPU using it heavily for one part of the game loop and the GPU using extreme bandwidth during another part. A Playstation 3 game with a typical game loop design would show the CPU using high bandwidth to its memory for half of the frame time, with the other half mostly unused; and the same for the GPU’s use of video RAM. That isn’t a disadvantage on the Playstation 3’s part, but it is a failure to use its full potential. A modified game loop that kept both rendering and CPU processing high would fare far better on the Playstation 3’s bandwidth and design than on the Xbox360’s.

    In the worst case scenario for the Playstation 3, if the GPU literally only used bandwidth for half of the game loop, over time you could consider its bandwidth to be half of its peak. The same thing applied to the Cell and XDR RAM would yield 12.8GB/s of bandwidth if it only used XDR half of the time. The Playstation 3 is not to be outdone, though – if a game loop looks like this, the RSX might as well take the XDR RAM bandwidth while the CPU is idling and increase its total bandwidth to 48GB/s.

    Xbox360 “Xenon” compared to Playstation 3’s “Cell” – the CPUs:
    Inter-core communication speed:
    Another mystery with the Xbox360 (at least in my view) is the inter-core communication on the Xenon CPU between its cores. IBM clearly documents the Cell’s inter-core communication mechanism, physically and as implemented in hardware and software. This bandwidth needs to be extremely high if separate cores need to communicate and share data effectively. The EIB on the Cell is documented at a peak performance of 204GB/s, with an observed rate of 197GB/s. The major factor affecting this rate is the direction, source, and destination of data flow between the SPEs and the PPE on the Cell. I tried to find the equivalent piece of hardware inside the Xenon CPU and haven’t found a direct answer.

    Looking at the second architectural diagram of the Xenon, it seems that the fastest method the cores can use to talk to each other is through the L2 cache. Granted, the Xenon only has 3 cores, but game modules are usually highly interdependent and will need to talk to each other frequently. I might be jumping the gun a bit, but given that the L2 cache and FSB run at half of the core speed, as opposed to the Playstation 3’s EIB, a set of wide rings built specifically for element-to-element streaming transfers, I’m pretty positive using the L2 cache to communicate is not going to be very fast. It seems that independent threads are really what Microsoft was aiming for with the Xbox360 CPU design, and games are not optimally implemented on it if they have massive streaming transfers to hand off to other cores. What would suggest that the Xbox360 cores can communicate quickly and with high bandwidth would be evidence that reads and writes to the L2 cache happen in larger segments than writes to the EIB, compensating for the lower clock speed. Additionally, just writing to memory isn’t enough, as the receiver needs some sort of notification that it has new data unless it is a permanent buffer. If anyone wants to do research on the topic, please add it to the discussion and include links to your sources.

    Enhanced VMX-128 instruction set:
    This is one of the features Microsoft boasts about to claim they have a better gaming machine than Sony. They focus on the fact that their enhancements support a single-cycle dot product instruction, and on the larger register file. The problem with this boast over the Playstation 3 is that it compares against the PPE’s VMX unit, which comparably only has 1 set of 32 128-bit registers and presumably fewer instructions. If the code requires 128 128-bit registers, or more complex instructions, then the code is most definitely vector processing heavy and should be run on an SPE, which sports the exact same register file size and includes a superset of the VMX instructions in terms of functionality (it is not a superset in terms of binary compatibility).

    While each core in the Xbox360 also has two VMX-128 register sets, this is done to support the dual threaded nature of the cores better. It doesn’t actually have two vector execution units. Each core only has one VMX-128 execution unit meaning that even though there are two sets of registers per core, two threads that are using vector code have to share this single execution unit.

    By comparison, the Cell’s PPE has the limited 32 128-bit register file with a single VMX vector unit. This is what Microsoft usually singles out when they compare the Playstation 3 to the Xbox360’s CPU. They forget (purposefully) that the Cell has 7 SPEs running at 3.2GHz, which is far greater SIMD performance than their 3 enhanced VMX-128 execution units. For vector based computations, the Playstation 3 undeniably outdoes the Xbox360 by a wide margin.

    The dot product instruction claim is matched, at least on the SPEs of the Playstation 3, through a simple multiply-add instruction. For those of you who aren’t mathematically inclined, a dot product is basically a measure of how parallel or perpendicular two lines are. Calculating a dot product is basically multiplying each pair of corresponding dimension values together, then adding all of those products together. Take two vectors <2, 3, 4> and <6, 7, 8>. The dot product would be: 2*6 + 3*7 + 4*8 = 65. If you read the earlier section in this post covering the SPEs and SIMD architectures, you should remember that, at the very least, an SPE can do all of the multiplying in one cycle, and all that needs to be done is a follow-up add between the elements of the result vector. I do know that the SPEs have a few multiply-add instructions, but the bit of haziness is whether the multiply can be an intra-vector (between two separate vectors) operation while the add is an inter-vector (between elements in the same vector) operation on the result of the multiply. Sony claims that the dot product can be done in one cycle on an SPE, and it is very reasonable that this is the case, as there are vector permute/shuffle/shift instructions in the SPE instruction set. There just isn’t a labeled dot product instruction in the SPE instruction set – but an intelligent programmer should find what he needs.
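
    As a minimal illustration of the two halves of that operation in C (using GCC’s generic vector extensions on a recent GCC, not actual SPE intrinsics):
    Code:

    typedef float v4 __attribute__((vector_size(16)));  /* 4 x 32-bit floats */

    /* Dot product of <2,3,4> and <6,7,8>, each padded with a 0. */
    float dot(v4 a, v4 b)
    {
        v4 p = a * b;                      /* [12, 21, 32, 0]: one SIMD multiply */
        return p[0] + p[1] + p[2] + p[3];  /* the inter-vector add -> 65 */
    }

    int main(void)
    {
        v4 a = {2, 3, 4, 0}, b = {6, 7, 8, 0};
        return (int)dot(a, b);             /* exits with 65 */
    }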

    I found the multiply-add instruction in the Cell BE Handbook. It takes 4 register operands: one is definitely the result and two are input vectors, while the third parameter, named ‘rc’, I think represents a control register that dictates how to perform inter- and intra-vector operations. That would mean the multiply-add instruction operates on only two vectors, with the control vector able to dictate an add between the result components of the multiply.

    Symmetrical Cores?:
    Symmetrical cores means identical cores. The appeal to this setup is entirely for developers. It represents no actual horsepower advantage over asymmetric cores since code running on any of the cores, will run exactly the same as it would run if it were on another core. Relocating code to different cores has absolutely no performance gain or loss unless it means something with respect to how the 3 cores talk to each other. It should be noted though, that thread relocation does matter between the cores, as a thread might not co-exist well with another thread that is trying to use the same hardware that isn’t duplicated on the core. In that case, the thread would be better located on a core that has that execution resource free or less used. The only case of this I can think of is the VMX-128 execution unit. I think most other hardware is duplicated on the cores in the 360 to allow for two threads to co-exist with almost no problem.

    The Cell chip has asymmetrical cores, which means they are not all identical. That being said, the SPEs are all symmetrical with each other, and code that runs on one SPE could be relocated to any other SPE in the Cell. While the execution speed local to each SPE is the same, there are performance issues related to the bandwidth the SPE is using and who it’s talking to on the EIB. Developers should look at where their SPE code is executing to ensure optimal bandwidth is observed on the EIB, but once they find an optimal location to execute the code, they can just put it there without rewriting anything. If a task was running on the PPE or the PPE’s VMX unit, it would have to be recompiled (if written in C), and probably rewritten if hardware-specific instructions are in the code (C or ASM), before it can move to an SPE; the same applies in reverse. Good design and architecture should immediately let developers know what should run on the PPE and what should run on the SPEs, eliminating the chance of rewriting code if they see something better fit to run on an SPE later in development.

    Is general purpose needed?:
    Another one of Microsoft’s claims for the Xbox360’s superiority in gaming is the general purpose processing advantage since they have 3 general purpose cores instead of 1.

    To say “most of the code is general purpose” probably refers to code size, not execution time. First, it should be clarified that “general purpose code” is only a label for the garden variety of instructions that may be given to hardware. On the hardware end, this code fits into various classifications such as arithmetic, load/store, SIMD, floating point, and possibly more. General purpose applications are programs made up of general purpose code, on the scale that one function might be arithmetically heavy while another might be memory bound. Good examples of this are MS Word, a web browser, or an entire operating system. In MS Word there is a lot of string processing, which involves some arithmetic, comparison, a lot of branching, and memory operations. When you click import or export and save to various file formats, it is an I/O heavy operation. Applications like these tend not to execute the same code over and over, and have many different functions that can operate on a relatively small set of data depending on what the user does. These functions can vary from very I/O bound (saving to disk), to string processing intensive (spelling/grammar check), to floating point intensive (an embedded Flash game or resizing an image). Ultimately, there is a large amount of code written to handle the small set of data, and most of it never gets executed.

    Games are not general purpose programs. Any basic game programming book will introduce you to the concept of a game loop. This loop contains all of the functionality a game performs each frame, and handles all of the events that can occur in the game. An important principle in a game loop is to avoid unnecessary branches, as they slow down execution and make the code unnecessarily long. A good example of this is the Cohen-Sutherland line clipping algorithm. Instead of writing lengthy and complicated branches to check which of the 9 regions a point lies in, the code performs 4 simpler checks and computes a region code which can easily be used.
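
    For the curious, here is what that region code computation looks like in C (a standard textbook formulation; the clip window bounds are parameters):
    Code:

    /* Cohen-Sutherland region code: four simple comparisons instead of
       branching over nine regions. */
    enum { LEFT = 1, RIGHT = 2, BOTTOM = 4, TOP = 8 };

    int outcode(float x, float y,
                float xmin, float xmax, float ymin, float ymax)
    {
        int code = 0;
        if (x < xmin) code |= LEFT;
        if (x > xmax) code |= RIGHT;
        if (y < ymin) code |= BOTTOM;
        if (y > ymax) code |= TOP;
        return code;   /* 0 means the point is inside the window */
    }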

    This automatic and repetitive processing has to occur for many game objects, which represents a massive amount of data with a relatively small code size. This is the opposite of the general purpose paradigm, which typically has a small set of data (a Word document or HTML) and performs many various functions on it, representing a large code size. Games processing has a large data size but a much smaller code size. Game objects also tend to be very parallel in nature, as game objects are typically independent until they interact (collision) – which means they can be processed well on SIMD architectures if they are well thought out.

    The whole integer advantage claim for the Xbox360 CPU is pretty stupid, considering SIMD architectures can operate on 4 32-bit integers at the same time, and integer processing is not the bottleneck of 3D games processing.

    What this general purpose power does grant Xbox360 owners over the Playstation 3 is the ability to run general purpose applications faster. If the Xbox360 had a web browser (official or not), the design of such an application would work better on general purpose CPUs. That being said, it’s too bad the Xbox360 doesn’t come with one, and web browsers don’t put the highest demand on general purpose processors to begin with. Most general purpose applications remain idle until the user gives actual input. The application will then process the task and complete it before sitting idle again.

    AI routines that navigate through large game trees are probably another area where general purpose processing power might be better utilized, since this code tends to be more branch laden and varying depending on the task the AI is actually trying to accomplish. The plus side for the Playstation 3 is the generation of these game trees, which is also time consuming. Generating a game tree is a more computation oriented task, and is likely to be executed faster by SIMD architectures. I am largely speaking speculatively from my Computer Science knowledge in this area. Anyone who knows more or has done more research on AI algorithms is welcome to add to the discussion here.

    The only case where I can really see the general purpose computing power of the Xbox360 cores manifesting itself as a true advantage over the Playstation 3 is if Windows or a similar OS were put on an Xbox360, with multiple applications running simultaneously along with some background services. Again, it is funny that the Playstation 3 is more likely to have a general purpose operating system running on it than the Xbox360, even though it would perform worse at such a task.

    XDR vs GDDR3 – System Memory Latency:
    XDR stands for eXtreme Data Rate while GDDR3 stands for Graphics Double Data Rate version 3. XDR RAM is a new next generation RAM technology from those old folks at Rambus, who brought out that extremely high bandwidth RDRAM back during the onset of Pentium 4 processors. DDR was released soon after and offered comparable bandwidth at a much lower cost. RDRAM also had increased latency, higher cost, and a few other drawbacks which ultimately led to it being dropped very quickly by Intel back when it was released. Take note that DDR RAM is not the same as GDDR RAM.

    Anyways, it was hard to make a good assessment of the exact nature of the performance difference between these two RAM architectures, but from what I gathered, GDDR3 is primarily meant to serve GPUs, which means bandwidth is the goal of the architecture at the cost of increased latency. For GPUs this is acceptable, since large streaming chunks of data are being worked on instead of small random accesses. In the case of CPU main memory, when more general purpose tasks are being performed, latency has increased importance for memory access times, because data will be accessed at random more frequently than a GPU would access it.

    That being said, the Xbox360 CPU’s bandwidth to RAM tops out at 21.6GB/s, while the Cell processor has even more bandwidth to its RAM at 25.6GB/s. XDR RAM also does this without incurring high latency, and I’m almost positive its latency is lower than that of GDDR3, which is actually considered to have high latency. Games are not going to be performing a lot of general purpose tasks, so the latency advantage for the Playstation 3 might not be that large, but the CPU will be performing more random accesses to memory regardless. The Xbox360’s CPU latency may be made worse than the inherent GDDR3 latency due to being separated from RAM by the GPU.
     
  4. Master Perfect

    Master Perfect Active Member

    Posts:
    116
    Likes Received:
    0
    I'm not going to read all of that now.
    Whether the Cell is stronger than the Xenon actually matters little. What mainly matters is how well you can harness that power. And on that point the Xenon has a huge advantage over the Cell: the Xenon's power is much easier to tap than the Cell's.
    And it isn't just the processor that determines how good games will look. It all depends on where the system's "bottleneck" is.
    After all, what good is a Ferrari if your tires can only handle 180km/h?
     
  5. SpatZuiver

    SpatZuiver Active Member

    Posts:
    36
    Likes Received:
    0
    But whether the Cell is more powerful than the Xenon certainly does matter. The nice thing is that the Cell is very flexible, which means you can do whatever you want with it. The Cell can, for example, also assist a video chip in computing 3D images.

    You talk about bottlenecks, but which bottlenecks do you mean?
     
  6. Xwon

    Xwon it is what it is.

    Posts:
    2,853
    Likes Received:
    1
    It can be taken as a given by now that "Tha Cell!!"|:- is somewhat more powerful.....
    Buuuuuut.............. you may also expect the average forum visitor to be aware of the many rumors about how hard it is to get "Tha Cell!!"|:- to do its work properly.....
    Just a few days ago there was another report about how much fallout there is in the production process of "Tha Cell!!"|:-

    I foresee a big fiasco for the PS3 in its first few years, with bad news for Sony practically every day.....

    And sure, people can then claim they are "just rumors", but ......
    Where there's smoke there's fire, certainly in Sony's case (or has anyone been playing a PS3 since spring? ....... Oh no, the rumors said it wouldn't make it, but Sony insisted until the very last week that it would make its launch..... :thumbs: )

    No 2007 launch in Europe :x
     
  7. Arjan

    Arjan XBW.nl VIP

    Posts:
    7,207
    Likes Received:
    3
    You yourself already mention a possible bottleneck! The bandwidth of the link between the RSX and its local video memory is 22.4GB/s. The bandwidth of the RSX itself is about 35GB/s. So the RSX already has to use the Cell's memory controller to make up that difference. The downstream bandwidth of the link between the Cell and the RSX is 15GB/s, so that is already being used. If you then also load the Cell with graphics instructions, you will certainly run short of bandwidth at 1080p @ 60fps.

    It is not for nothing that Sony decided to bring in nVidia to supply the graphics chip. The Cell was not 'good' enough for this job. The downside was that the development time for an entirely new GPU was too short, so they adapted the design of the G70 instead.
     
  8. SpatZuiver

    SpatZuiver Active Member

    Posts:
    36
    Likes Received:
    0
    Arjan, did you actually read my post properly, or do you just not understand it?

    The RSX has its own 256MB of memory with 22.4GB/s of bandwidth. The Cell has its own 256MB of memory with 25.6GB/s of bandwidth. The Cell and the RSX can communicate with each other over a link from RSX to Cell of 20GB/s, and back again at 15GB/s.

    When the RSX has no bandwidth of its own left, it could therefore get a share of the Cell's bandwidth.

    On the Xbox 360 you have a unified memory architecture with a total bandwidth of 22.4GB/s. The GPU and CPU have to share this. You can assume the GPU claims most of that bandwidth, which leaves you with a memory bandwidth shortage for the CPU. So if either console has a bottleneck, it's the Xbox 360.

    Xwon, where do you get that "somewhat more powerful" from? Show me where I can read that the difference is only "somewhat".

    According to many, supported by technical data, the difference is more on the order of twice as powerful.
     
    Last edited by a moderator: Jul 17, 2006
  9. Xwon

    Xwon it is what it is.

    Posts:
    2,853
    Likes Received:
    1

    Well, normally I'm all for sources and such, but the really technical stuff you're all focusing on right now goes just a bit over my head.....

    That "somewhat" is what I read in the many reports ("somewhat" still doesn't say whether it's twice as much, as you claim O-) )
    It's more a sum of the many stories, and of Sony's tough talk that they can't back up with impressive footage or genuinely positive reactions from developers.....

    To bring up the earlier example once more..... "What good is a Ferrari if your tires can only handle 180km/h?"
     
  10. CaptnCook

    CaptnCook Active Member

    Messages:
    8,478
    Likes Received:
    10
    That the PS3 would be faster on paper is all very nice (although Sony's specifications change umpteen times, so you never know which specs to go by), but has it actually been proven in practice that the PS3 is that much more powerful? Are the games that much better-looking?

    The answer is no, twice. The PS3 games shown (the ones that were actually in-game; a large share was/is still not in-game) do not surpass the Xbox 360 games. Even games like MGS4 and Heavenly Sword, which will very likely be 2nd-generation games, don't blow away, say, Gears of War (a 2nd-generation Xbox 360 game).

    Maybe it's time to take off the Sony glasses and see reality.
     
    Last edited: 17 Jul 2006
  11. Xwon

    Xwon it is what it is.

    Messages:
    2,853
    Likes Received:
    1

    What he said ;)
     
  12. SpatZuiver

    SpatZuiver Active Member

    Messages:
    36
    Likes Received:
    0
    I'm not claiming anything; I'm just posting what I've read.

    Besides, you should only judge graphics quality once you've played those games yourself.

    The plain fact is that the Xbox 360 has a bottleneck you can't sweep under the rug. Where unified memory was still an advantage on the Xbox 1, it is a bottleneck in the next gen.
     
  13. Xwon

    Xwon it is what it is.

    Messages:
    2,853
    Likes Received:
    1
    A more important fact in all of this is that Sony still has something to prove, while MS hasn't run into any bottleneck of note yet....
    Sony's biggest "bottleneck" is its marketing talk and the belief that they can make the PS3 a success simply because they were so successful with the PS1 and PS2...... :lol: In any case, I have seen or heard absolutely NOTHING that proves otherwise..... THAT is the bottleneck :b:
     
  14. CaptnCook

    CaptnCook Active Member

    Messages:
    8,478
    Likes Received:
    10
    Then why do you say the PS3 games will look better? Have you played them already? :{
    How do you know the PS3 will outclass the Xbox 360? Do you already have them both side by side? What's on paper is nice, but it isn't reality. What matters is the actual overall package and what the developers do with it / can do with it.

    In Formula 1, for instance, Williams has one of the most powerful engines. Are they leading the championship? No. Why not? The overall package isn't right. And the drivers often fail to get the maximum out of it too.

    On paper things can look ever so good; the Emotion Engine was supposed to be great too, yet reality was / is different.
     
    Last edited: 17 Jul 2006
  15. MacDennis

    MacDennis Active Member

    Messages:
    1,223
    Likes Received:
    0
    Completely agreed. All that droning on about TFLOPS and gigabits per second for a console of which nobody has seen a final unit yet is mind-numbing.

    Look at the in-game screenshots and videos to get an idea of what the PS3 can do. What I often see are low-res textures and low-polygon models. Why is that? A nice example is Coded Arms ..

    http://media.ps3.ign.com/media/826/826975/img_3725510.html
    http://media.ps3.ign.com/media/826/826975/img_3725504.html
    :eek:
     
  16. SpatZuiver

    SpatZuiver Active Member

    Messages:
    36
    Likes Received:
    0
    Last edited: 17 Jul 2006
  17. 36O

    36O Guest

    well... to be honest I agree with arjan and the others...... What Sony originally wanted with the CELL was to create one big, super-powerful super-CPU (without a single drawback, only advantages) that had to / could serve as both CPU and GPU!
    And what happened? Sony doesn't have enough experience for that (Emotion Engine, anyone?), and even IBM advised Sony, as late as 18 months before E3 2005, to go knock on a GPU maker's door after all, because the CELL lacked the power (not to mention the processor's countless drawbacks).
    Well, if even a pro outfit like IBM talks that worriedly about the CELL, Sony can't ignore it.
    So what do they do? With only 18 months left before the machine's unveiling, they quickly knock on nVidia's door for a video processor (nVidia, as it happens, because they were still available; MS had switched from nVidia to ATI; does anyone know why, by the way?).
    And that's how the RSX came about: under severe time pressure and with a (nearly) identical architecture to the G70 (also by nVidia).
    If you ask me, I'm not as impressed by the RSX as I am by the Xenos.
    The RSX is geared more towards floating-point calculations (less suited for games), whereas the Xenos from ATI Technologies and the Microsoft Hardware Group runs off with features like the Unified Shader Architecture (and USA is precisely what does suit games) and the 10 MB of embedded DRAM (which, according to ATI, also comes in very handy when rendering games). See the sketch at the end of this post for what USA buys you.
    The only advantage the RSX then has over the Xenos is that it carries 300 million transistors on board; does anyone know what that actually amounts to?

    To come back to the CELL for a moment: we're not the only ones who have been informed of its drawbacks.
    Somewhere in 2005 Sony offered the CELL to Apple.
    Apple thanked Sony for the offer, but announced it would go with Intel hardware anyway....well.....smart choice..

    And as for games.... I understood from Mirrik (who played on both platforms at E3) that titles like Heavenly Sword and MGS4 are technically of identical quality to titles like Gears of War, Mass Effect and Oblivion.
    The Gamekings guys (who were also at E3) likewise said that most PS3 games could roughly run on a 360 just as well.
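    To make that USA point concrete, here is a toy Python sketch of why a unified pool of shader units (Xenos-style; the real chip has 48 unified ALUs) can beat a rigid vertex/pixel split when a frame is lopsided. The unit counts and job numbers are invented purely for the example:

        import math

        def frames(vertex_jobs, pixel_jobs, v_units, p_units):
            # Each unit finishes one job per frame; the slowest stage sets the pace.
            return max(math.ceil(vertex_jobs / v_units),
                       math.ceil(pixel_jobs / p_units))

        # A pixel-heavy scene: 20 vertex jobs, 200 pixel jobs, 48 units in total.
        rigid   = frames(20, 200, v_units=16, p_units=32)  # fixed split idles vertex units
        unified = frames(20, 200, v_units=5,  p_units=43)  # pool rebalanced to the load

        print(f"rigid split: {rigid} frames; unified pool: {unified} frames")

    Same silicon, fewer idle units: that flexibility is the advantage being claimed for the Xenos here.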
     
    Last edited by a moderator: 17 Jul 2006
  18. Arjan

    Arjan XBW.nl VIP

    Messages:
    7,207
    Likes Received:
    3
    Yep, I read the whole article on the PS3 forum a while ago, plus the necessary material on other websites. And what's more, I actually understand it ;)

    What you list above is exactly the same story I posted earlier, except you leave out the part about the possible bottleneck. When a developer wants to use the RSX's full bandwidth, he has to call on the Cell. That's no problem in itself, but it is when you have the Cell carry out graphics calculations at the same time. That was my point. Theoretically this could be a bottleneck, which by definition doesn't mean it will play out that way in practice. Just look at the Xbox 360.

    Edit//

    You can argue about it at length, but bottlenecks will always be with us! The Xbox 360 has them, and so do the PS3, PC hardware, and so on....
    The core hardware can always compute more per clock tick than it can send to or receive from another component. It's like a nice stretch of asphalt between two cities: all well and good, but when it gets busy, traffic grinds to a halt.
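    That "compute faster than you can move data" point is essentially a roofline argument; here is a toy Python sketch of it (every number below is invented purely for illustration):

        # Attainable speed is capped either by raw compute or by how fast
        # the link can feed the chip with data.
        compute_gflops = 200.0   # hypothetical peak arithmetic rate
        link_gbs       = 20.0    # hypothetical link bandwidth
        flops_per_byte = 2.0     # hypothetical arithmetic intensity

        fed_gflops = link_gbs * flops_per_byte       # what the link can sustain
        attainable = min(compute_gflops, fed_gflops)

        bound = "bandwidth" if fed_gflops < compute_gflops else "compute"
        print(f"attainable: {attainable} GFLOPS ({bound}-bound)")

    With these numbers the chip sits at 40 of its 200 GFLOPS: the asphalt is full long before the engines are.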
     
    Last edited: 17 Jul 2006
  19. CaptnCook

    CaptnCook Active Member

    Messages:
    8,478
    Likes Received:
    10
    Take a look at how little (low-res) detail there is on the characters in MGS 4:

    [four MGS 4 screenshots]

    Compare that with the rich detail in Gears of War or in Halo 3:

    [four Gears of War / Halo 3 screenshots]

    All three are graphical showpieces (all three 2nd-generation games).
    MGS 4, in my eyes the best-looking PS3 game, does not blow away a Halo 3 or a Gears of War. So I'd really like to see whether the PS3 ends up blowing the Xbox 360 away.

    I rest my case.
     
    Last edited: 17 Jul 2006
  20. MacDennis

    MacDennis Active Member

    Messages:
    1,223
    Likes Received:
    0
    Mwoahhh ..

    I'm talking here about titles that will appear at least a year after the 360, for a console that is supposed to fetch 600 EUR and that according to some has to be twice as powerful as the 360. So let's not compare rotten apples with rotten pears. ;)

    Warhawk
    http://media.ps3.ign.com/media/748/748468/img_3580867.html

    Fatal Inertia
    http://media.ps3.ign.com/media/749/749967/img_3699010.html

    Motorstorm
    http://media.ps3.ign.com/media/748/748488/img_3723327.html

    Sonic
    http://media.ps3.ign.com/media/770/770960/img_3590788.html
     