Rumor: final Xbox 720 specifications in Xbox World.

RattleSnake

65607681469723726373.jpg




In one of its issues, Xbox World magazine devotes 8 pages to information about Microsoft's next-generation console, the Xbox 720, code-named Durango.

CVG says the magazine's claims are credible and close to the truth:
the magazine has previously published information that Microsoft later confirmed.

Dan Dawkins, the magazine's editor, tells CVG:
Xbox World has been receiving this information over a period of 12 months. The significant changes and the main news will appear in the issue before E3 and in June.
The console will be shipped to developers in March. The first hardware details were given to game developers on 28 February 2012 at a conference in London.
After the presentation, Crytek's technical director wrote on Twitter: we had the Durango presentation in London. It is huge and interesting to talk about. Other sessions were held in San Francisco to show the console to American developers.



Microsoft's next-generation console will be known as Xbox, and its code name for developers is Durango. It will have a quad-core CPU running 16 threads (4 logical cores per core) and 8 GB of memory. The magazine says the console's GPU comes from AMD's 7000 series.



Xbox-720-Mockup-635x422.jpg

The Xbox 720 will have Blu-ray, to increase storage capacity. It will be possible to watch TV and even record TV programmes. To help the launch, Kinect 2.0 will be bundled with the console, with better accuracy and even lip-reading ability, plus a new TV output and input.

The console will have 4 USB 3.0 ports, touch-sensitive buttons and a hard drive, along with additional HDD slots, HDMI and AV.
Games for the console will ship on discs with a capacity of 8.9GB. One of the console's advantages is said to be its impressive rendering capability.




About Kinect 2.0: the device will be able to support 4 players at once, reading their movements simultaneously and not missing even the smallest finger movement. Thanks to the upgraded camera and the console's processor it responds well in large rooms, and it can accurately read your details and your friends' faces.






 
Here is Microsoft's response to this rumor:
The manufacturer has responded to our mails. “Microsoft does not comment on rumours or speculation. We are always thinking about what is next for our platform, but we don’t have anything further to share at this time.”
 
A new rumor says the next Xbox may be based on Windows 8 and support a touchscreen.
It's not a rumor, it's a prediction by a former senior Microsoft employee who has even said Ballmer should resign. He left Microsoft years ago, before 2004 if I'm not mistaken.
 
Microsoft's new Xbox will include improved Siri-like speech recognition | The Verge

Microsoft's next console may use a speech recognition system similar to Apple's Siri.

Microsoft will greatly improve its speech recognition technology inside the next Xbox, The Verge has learned. Sources familiar with Microsoft's Xbox plans have revealed that Durango, the codename for the next Xbox, will support wake on voice, natural language controls, and speech-to-text. The improved capabilities mean that Xbox users will be able to walk into a room and simply say "Xbox on" to wake up the new Xbox.
We understand that Microsoft is also investigating scenarios where a Kinect sensor will detect individuals in a room and suggest appropriate multiplayer games after a user queries the Xbox using voice. The support will include natural language detection, similar to Apple's Siri service, that will let users ask things like "what are my friends playing" to receive a friends list. Xbox will also reply back to users with answers to queries, making it an improved search service too. The current Xbox 360 console lacks natural interaction and context, we're told that's a big focus of the new speech recognition in the new Xbox.
Users will also be able to automatically resume video content where it left off simply by asking the new Xbox to play a particular movie. With speech-to-text built-in, it's likely that Microsoft will utilize this support to type out messages using the new Xbox. It's widely expected that Skype will make its Xbox debut on the new console. Microsoft will fully detail its new Xbox at E3 this year, with suggestions from sources that the company may hold a separate event to unveil its new hardware ahead of E3. The new Xbox is expected to be released later this year.
 
Some new information about the console. I have no idea how this site keeps getting hold of information like this :D

World Exclusive: Durango’s Move Engines

Moore’s Law imposes a design challenge: How to make effective use of ever-increasing numbers of transistors without breaking the bank on power consumption? Simply packing in more instances of the same components is not always the answer. Often, a more productive approach is to move easily encapsulated, math-intensive operations into hardware.

The Durango GPU includes a number of fixed-function accelerators. Move engines are one of them.

Durango hardware has four move engines for fast direct memory access (DMA).

These accelerators are truly fixed-function, in the sense that their algorithms are embedded in hardware. They can usually be considered black boxes with no intermediate results that are visible to software. When used for their designed purpose, however, they can offload work from the rest of the system and obtain useful results at minimal cost.

The following figure shows the Durango move engines and their sub-components.

di-RIZ3.jpg




The four move engines all have a common baseline ability to move memory in any combination of the following ways:
•From main RAM or from ESRAM
•To main RAM or to ESRAM
•From linear or tiled memory format
•To linear or tiled memory format
•From a sub-rectangle of a texture
•To a sub-rectangle of a texture
•From a sub-box of a 3D texture
•To a sub-box of a 3D texture

The move engines can also be used to set an area of memory to a constant value.

DMA Performance

Each move engine can read and write 256 bits of data per GPU clock cycle, which equates to a peak throughput of 25.6 GB/s both ways. Raw copy operations, as well as most forms of tiling and untiling, can occur at the peak rate. The four move engines share a single memory path, yielding a total maximum throughput for all the move engines that is the same as for a single move engine. The move engines share their bandwidth with other components of the GPU, for instance, video encode and decode, the command processor, and the display output. These other clients are generally only capable of consuming a small fraction of the shared bandwidth.
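The 25.6 GB/s figure follows directly from the bus width and the GPU clock. As a rough check, the sketch below derives the clock implied by the two quoted numbers and estimates a best-case copy time. The clock is not stated in this section, so the value is an inference from the quoted figures, and `copy_time_ms` is a hypothetical helper used only for illustration.

```python
# Back-of-the-envelope check of the quoted move-engine throughput.
BITS_PER_CYCLE = 256            # from the text: 256 bits per GPU clock cycle
PEAK_BYTES_PER_SEC = 25.6e9     # from the text: 25.6 GB/s (decimal gigabytes)

bytes_per_cycle = BITS_PER_CYCLE // 8

# Clock implied by the two quoted numbers (inferred, not quoted in the text).
implied_clock_hz = PEAK_BYTES_PER_SEC / bytes_per_cycle


def copy_time_ms(num_bytes, rate=PEAK_BYTES_PER_SEC):
    """Best-case time to move num_bytes at the peak rate, in milliseconds."""
    return num_bytes / rate * 1e3


print(f"implied GPU clock: {implied_clock_hz / 1e6:.0f} MHz")
print(f"copying 32 MiB at peak: {copy_time_ms(32 * 1024 * 1024):.2f} ms")
```

Because the four engines share a single memory path, running several copies at once would not shorten the total below this best case.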

The careful reader may deduce that raw performance of the move engines is less than could be achieved by a shader reading and writing the same data. Theoretical peak rates are displayed in the following table.

di-PZU6.jpg


The advantage of the move engines lies in the fact that they can operate in parallel with computation. During times when the GPU is compute bound, move engine operations are effectively free. Even while the GPU is bandwidth bound, move engine operations may still be free if they use different pathways. For example, a move engine copy from RAM to RAM would not be impacted by a shader that only accesses ESRAM.

Generic lossless compression and decompression

One move engine out of the four supports generic lossless encoding and one move engine supports generic lossless decoding. These operations act as extensions on top of the standard DMA modes. For instance, a title may decode from main RAM directly into a sub-rectangle of a tiled texture in ESRAM.

The canonical use for the LZ decoder is decompression (or transcoding) of data loaded from off-chip from, for instance, the hard drive or the network. The canonical use for the LZ encoder is compression of data destined for off-chip. Conceivably, LZ compression might also be appropriate for data that will remain in RAM but may not be used again for many frames—for instance, low latency audio clips.

The codec employed by the move engines is LZ77, the 1977 version of the Lempel-Ziv (LZ) algorithm for lossless compression. This codec is the same one used in zlib, glib and other standard libraries. The specific standard that the encoder and decoder adhere to is known as RFC 1951 (the DEFLATE format). In other words, the encoder generates a compliant bit stream according to this standard, and the decoder can decompress certain compliant bit streams, and in particular, any bit stream generated by the encoder.

LZ compression involves a sliding window and operates in blocks. The window represents the history available to pattern-match against. A block denotes a self-contained unit, which can be decoded independently of the rest of the stream. The window size and block size are parameters of the encoder. Larger window and block sizes imply better compression ratios, while smaller sizes require less calculation and working memory. The Durango hardware encoder and decoder can support block sizes up to 4 MB. The encoder uses a window size of 1 KB, and the decoder uses a window size of 4 KB. These facts impose a constraint on offline compressors. In order for the hardware decoder to interpret a compressed bit stream, that bit stream must have been created with a window size no larger than 4 KB and a block size no larger than 4 MB. When compression ratio is more important than performance, developers may instead choose to use a larger window size and decode in software.
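To make the constraint on offline compressors concrete, here is a minimal sketch using Python's standard `zlib` module, which produces RFC 1951 streams. It assumes the hardware would accept a raw DEFLATE stream with no zlib header, and it treats each independently compressed chunk of at most 4 MB as one "block"; neither detail is confirmed by the text. `compress_for_hw` and `decompress_sw` are hypothetical helper names.

```python
import zlib

HW_WINDOW_BITS = 12              # 2**12 bytes = 4 KB, the decoder window quoted above
HW_MAX_BLOCK = 4 * 1024 * 1024   # 4 MB block limit quoted above


def compress_for_hw(data):
    """Split data into chunks of at most 4 MB and DEFLATE each one
    independently with a 4 KB window. Negative wbits selects a raw
    RFC 1951 stream; that framing choice is an assumption."""
    chunks = []
    for start in range(0, len(data), HW_MAX_BLOCK):
        piece = data[start:start + HW_MAX_BLOCK]
        comp = zlib.compressobj(9, zlib.DEFLATED, -HW_WINDOW_BITS)
        chunks.append(comp.compress(piece) + comp.flush())
    return chunks


def decompress_sw(chunks):
    """Software reference decoder using the same 4 KB window."""
    out = []
    for chunk in chunks:
        dec = zlib.decompressobj(-HW_WINDOW_BITS)
        out.append(dec.decompress(chunk) + dec.flush())
    return b"".join(out)


sample = bytes(range(256)) * 40000   # about 10 MB of repetitive data
packed = compress_for_hw(sample)
assert decompress_sw(packed) == sample
print(len(packed), "chunks,", sum(map(len, packed)), "compressed bytes")
```

A larger window (zlib's default is 32 KB) would usually compress better, which is the trade-off the paragraph describes: such streams would have to be decoded in software instead.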

The LZ decoder supports a raw throughput of 200 MB/s compressed data. The LZ encoder is designed to support a throughput of 150-200 MB/s for typical texture content. The actual throughput will vary depending on the nature of the data.



JPEG decoding

The same move engine that supports LZ decoding also supports JPEG decoding. Just as with LZ, JPEG decoding operates as an extension on top of the standard DMA modes. For instance, a title may decode from main RAM directly into a sub-rectangle of a tiled texture in ESRAM. The move engines contain no hardware JPEG encoder, only a decoder.

The JPEG codec used by the move engine is known as ISO/IEC 10918-1, which was the 1994 JPEG committee standard. The hardware decoder does not support later standards, such as JPEG 2000 (wavelet encoding) or the format known variously as JPEG XR, HD Photo, or Windows Media Photo, which added a number of extensions to the base algorithm. There is no native support for grayscale-only textures or for textures with alpha.

The move engine takes as input an entire JPEG stream, including the JFIF file header. It returns as output an 8-bit luma (Y or brightness) channel and two 8-bit subsampled chroma (CbCr or color) channels. The title must convert (if desired) from YCbCr to RGB using shader instructions.
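The conversion step left to the title is a small per-pixel matrix operation. The sketch below shows the arithmetic in Python rather than shader code, using the standard JFIF full-range YCbCr constants; whether the Durango decoder output follows exactly this convention is an assumption, and `ycbcr_to_rgb` is an illustrative name.

```python
def clamp8(x):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(x))))


def ycbcr_to_rgb(y, cb, cr):
    """Convert one 8-bit full-range YCbCr sample to RGB (JFIF constants)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)


# Neutral chroma (Cb = Cr = 128) leaves a grey whose level equals the luma.
print(ycbcr_to_rgb(128, 128, 128))
```

In a shader the same three lines would run once per fetched texel, after the luma and chroma samples have been read.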

The JPEG decoder supports both 4:2:2 and 4:2:0 subsampling of chroma. For illustration, see Figures 2 and 3. 4:2:2 subsampling means that each chroma channel is ½ the resolution of luma in the x direction, which implies a footprint of 2 bytes per texel. 4:2:0 subsampling means that each chroma channel is ½ the resolution of luma in both the x and y directions, which implies a footprint of 1.5 bytes per texel. The subsampling mode is a property of the compressed image, specified at encoding time.
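The two footprint figures can be checked by counting bytes per plane. This is a small sketch assuming 8 bits per sample and even image dimensions; `decoded_bytes` is a hypothetical helper.

```python
def decoded_bytes(width, height, mode):
    """Bytes needed for an 8-bit luma plane plus two subsampled chroma planes."""
    luma = width * height
    if mode == "4:2:2":      # chroma halved in x only
        chroma = 2 * (width // 2) * height
    elif mode == "4:2:0":    # chroma halved in both x and y
        chroma = 2 * (width // 2) * (height // 2)
    else:
        raise ValueError(f"unsupported subsampling mode: {mode}")
    return luma + chroma


for mode in ("4:2:2", "4:2:0"):
    total = decoded_bytes(1024, 1024, mode)
    print(mode, total, "bytes,", total / (1024 * 1024), "bytes per texel")
```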

In the case of 4:2:2 subsampling, the luma and chroma channels are interleaved. The GPU supports special texture formats (DXGI_FORMAT_G8R8_G8B8_UNORM) and tiling modes to allow all three channels to be fetched using a single instruction, even though they are of different resolutions.

JPEG decoder output, 4:2:2 subsampled, with chroma interleaved.

di-JFJ3.jpg


In the case of 4:2:0 subsampling, the luma and chroma channels are stored separately. Two fetches are required to read a decoded pixel—one for the luma channel and another (with different texture coordinates) for the chroma channels.

JPEG decoder output, 4:2:0 subsampled, with chroma stored separately.

di-FCVO.jpg


Throughput of JPEG decoding is naturally much less than throughput of raw data. The following table shows examples of processing loads that approach peak theoretical throughput for each subsampling mode.

Peak theoretical rates for JPEG decoding.

di-WON5.jpg


System and title usage

Move engines 1, 2 and 3 are for the exclusive use of the running title.

Move engine 0 is shared between the title and the system. During the system’s GPU time slice, the system uses move engine 0. During the title’s GPU time slice, move engine 0 can be used by title code. It may also be used by Direct3D to assist in carrying out title commands. For instance, to complete a Map operation on a surface in ESRAM, Direct3D will use move engine 0 to move that surface to main memory.
 
Weren't the earlier specs supposed to be final? Why do they keep changing?

Nothing has changed, things are just being added:

The Durango GPU includes a number of fixed-function accelerators. Move engines are one of them.

I've talked before about fixed functions that act as processing accelerators. This move engine is just one example of the accelerator functions used inside the GPU architecture; apparently there are others too, each with its own job.
This one's job is to speed up moving data in memory between main memory and ESRAM, plus a few other things such as Generic lossless compression and decompression, so that compressing and extracting data puts no extra load on the processor.

The job of fixed functions is to accelerate processing. Microsoft seems to be counting heavily on them, and more information will come out later; this was apparently just one of them.
 
I thought only Nintendo was still in love with FixedFunction. By the way, 8030, isn't this difference in data transfer rates between the PS and the XBOX going to be a problem?

In what respect?

Setting the move engine topic aside, though:
information like this, which is more or less the master key to a console's success at heavy processing, is usually highly confidential, because it takes years of time and testing before researchers work out how to improve graphics processing performance at the lowest possible cost.
Whoever leaked this clearly has full access to the original source.
The next-generation xbox has low bandwidth on the system RAM side, and the gpu talks to system memory at only 68 GB/s {leaving aside the esram, which because of its small size is used for work that needs a lot of bandwidth and little data}. So Microsoft's engineers are obviously well aware of this and use tricks like these both to cover the shortcomings and to raise the final efficiency.
 

I meant something like the PS3's problem with the Skyrim DLCs. With one console having more RAM and the other faster RAM, games that get ported from one to the other could run into a similar problem.
 
More new information, including the reasons for choosing DDR3 over GDDR5 and for using the esram chip, the Data Move Engines and so on. Really interesting stuff that shows how Microsoft can get 100 percent efficiency out of the hardware and make full use of its bandwidth, and it also shows traces of John Carmack's techniques such as Virtual Texturing.

Durango makes us take the DeLorean

Posted on 8 February, 2013 by Urian

First of all, sorry for the delay on this entry. I wanted to do it properly, and I think you will understand why once you have read it.

A few days ago VGLeaks leaked part of the documentation describing the next Xbox's GPU, which is a variation of the GCN architecture with some additions. At first glance the non-custom part can cause some confusion, since Microsoft uses a nomenclature completely different from AMD's. The following table lists the equivalent names, so that we can move straight to the part that interests us: what really defines the Xbox 8/Durango.
[TABLE="class: aligncenter"]
[TR]
[TD="align: center"]Microsoft
[/TD]
[TD="align: center"]AMD
[/TD]
[/TR]
[TR]
[TD="align: center"]Shader Core (SC)
[/TD]
[TD="align: center"]Compute Unit (CU)
[/TD]
[/TR]
[TR]
[TD="align: center"]Local Shared Memory
[/TD]
[TD="align: center"]Local Data Share
[/TD]
[/TR]
[TR]
[TD="align: center"]Global Shared Memory
[/TD]
[TD="align: center"]Global Data Share
[/TD]
[/TR]
[TR]
[TD]Color Block (CB) + Depth Block (DB)
[/TD]
[TD]Raster Back End (RBE)
[/TD]
[/TR]
[/TABLE]

But to better understand the custom parts we must take the DeLorean.

First trip with the DeLorean: a journey to the past
As for its custom parts, the system's GPU is reminiscent of a professional graphics processor that is little known to the public, since it was never popular in the consumer market. In 2002 it included a technology that had never before been integrated into a graphics card and would not be integrated at the hardware level again until the arrival of the GCN architecture. I am talking about the P10 from 3Dlabs.
Beyond the hardware implementation of the memory bus, 3Dlabs believes that the "virtual memory" system used in the P10 has far more significance and potential impact on the 3D market; in fact, it is something that John Carmack of id has long been asking for in hardware. The concept of virtual memory is very similar to the one used in the CPU memory system: it removes the barriers between the computer's different memory subsystems, such as the local frame buffer, main RAM or even hard disk space, and lets the 3D processor access them freely.

In the P10's virtual memory system there is a logical address space of up to 16 GB, divided entirely into 4 KB pages. The RAM on the card essentially becomes a huge L2 cache for the chip, a system that is easy for compilers to understand.
Not for nothing had Carmack been pushing for virtual memory on GPUs for years. In a letter dated March 7, 2000 he already explained the need to move to a virtual memory system. The reason was to avoid so-called "texture thrashing", a problem Carmack described as follows:
Almost all drivers do purely LRU memory management. This works correctly as long as the total textures needed in a frame fit into memory once they have been loaded. The moment you need slightly more memory than fits on the card, performance falls sharply. If you need 14 MB of textures to render a frame and your graphics card has only 12 MB available after its frame buffers, then instead of uploading just the 2 MB that do not fit, you end up uploading all 14 MB every frame. Having the CPU generate 14 MB of command traffic can drop the frame rate into single digits on many drivers.
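The performance cliff in the 14 MB / 12 MB example can be reproduced with a tiny simulation. The sketch below assumes each texture is 1 MB and that every frame touches the textures in the same order; both are simplifications for illustration, and `lru_misses_per_frame` is a hypothetical helper.

```python
from collections import OrderedDict


def lru_misses_per_frame(num_textures, capacity, frames):
    """Touch textures 0..num_textures-1 in the same order each frame and
    count how many must be (re)uploaded under a strict LRU policy."""
    cache = OrderedDict()
    misses = []
    for _ in range(frames):
        frame_misses = 0
        for tex in range(num_textures):
            if tex in cache:
                cache.move_to_end(tex)          # mark as most recently used
            else:
                frame_misses += 1
                if len(cache) >= capacity:
                    cache.popitem(last=False)   # evict the least recently used
                cache[tex] = True
        misses.append(frame_misses)
    return misses


print(lru_misses_per_frame(14, 12, 4))   # working set exceeds the card
print(lru_misses_per_frame(12, 12, 4))   # working set fits exactly
```

With 14 textures and room for 12, every access misses in every frame, because LRU always evicts the texture that will be needed soonest. With 12 textures only the first frame uploads anything.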
His idea for solving the problem is one we all know by now: Virtual Texturing. Carmack described it as follows; keep in mind that this is the year 2000:
Problems with large textures can be solved by simply not using large textures. Losses from texels that are never referenced can be reduced by cutting all textures into 64 × 64 or 128 × 128 pieces. This requires pre-processing, adds geometry, and requires a messy overlap of textures to hide the seams between them.
It is currently possible to estimate which mip map levels are needed and swap in only those. An application cannot calculate exactly which mip map levels the hardware will reference, because there are small variations between chips and the slope calculation would add significant processing overhead. A conservative upper bound can be obtained by looking at the minimum normal distance of any vertex that references a given texture in a frame. This would overestimate the required textures by about 2x and would still leave a big hit when the top mip level loads for large textures, but it can allow large cathedral-style scenes to render without swapping.
Clever developers can always work harder to overcome obstacles, but in this case there is a clear hardware solution that gives better performance than anything possible in software and makes everyone's life easier: virtualizing the card's view of its local memory.
With page tables, address fragmentation is not a problem, and with the graphics rasterizer only causing a page load when something from that exact 4 KB block is needed, the mip level problems and the hidden texture problems simply disappear. Nothing sneaky has to be done by the application or the driver; you only manage page indexes.
The hardware requirements are not very heavy. The graphics card needs the ability to automatically load translation lookaside buffers (TLBs) from page tables in local memory, and the ability to move a page across AGP or PCI into graphics memory and update the page tables and reference counts. You don't even need many TLBs, since the access patterns do not jump all over memory the way CPU accesses can. Even with a single TLB per texture unit, refills would account for only 1/32 of memory accesses if the textures were in 4 KB blocks. At the upper limit, all you would want is enough TLB for each texture unit to cover the texels referenced on a typical rasterized scanline.
Some developers will say "I don't want the system to manage textures, I want total control". There are a couple of answers to this. First, page-level management gives you flexibility you cannot get from a software scheme, so you gain new capabilities. Second, you can still treat it as a fixed texture buffer and manage it yourself with updates. Third, even if this were slower than the cleverest possible software scheme (which I seriously doubt), development is largely about trading theoretical efficiency for faster development. We no longer code overlays in assembly language!
Some hardware designers will object that the graphics engine sits idle while it waits for a page of data to come across the bus. Sure, it will always be better to have enough texture memory and never have to swap, and this feature would not let you claim more megapixels or more millions of triangles, but every card ends up without enough memory at some point. Ignoring these real-world cases does not help your customers. In any case, the stall is far shorter than if the texture were loaded through the command FIFO.
3Dlabs is supposed to have some form of virtual memory management in the Permedia 3, but I am not familiar with the details (if someone from 3Dlabs can send me the latest register specifications, I would appreciate it!).
P10 = Permedia 3. Later he comments on the use of embedded RAM for this concept:
Embedded DRAM should be a driving force. It is possible to put several megabytes of high-bandwidth memory on a chip with a video controller, but it will not be possible (for now) to fit the 64 MB of a GeForce there. With virtualized texturing, pressure on memory is drastically reduced. Even an 8 MB card would be enough for gaming at 1024 × 768 in 16-bit or 800 × 600 in 32-bit, no matter what the texture load is.
Second trip with the DeLorean: the near present.
In one of my earlier entries on Durango I talked about precisely this subject, the implementation of Virtual Texturing, in connection with a patent in the name of Mark S. Grossman and Microsoft. What follows is a déjà vu, but it helps connect the dots between the past and the future.
________________________________________________________________________________________________________________________
On the other hand, there is one element that makes me think the architecture chosen by Microsoft is the HD 7×00/GCN, and that is Partially Resident Textures.
Our friend Grossman is named as the inventor on a patent assigned to Microsoft that describes exactly this subject. Although this technology is not part of the DirectX 11 specification, the fact that Microsoft holds a patent on the same technology is another clue linking Microsoft's next console to this GPU.
As you can see in this diagram, the Texture Unit can not only read textures from main RAM, it can also read a tile map. The patent reads as follows:
The tile map can specify which tiles are stored in texture memory. In one implementation, the tile map may contain one or more tables that can be used to determine the level of detail (if any) available for each tile, so the tile map can reside in a memory unit. In some implementations the tile map may be located in texture memory, but this is not required.
A map, or hash table, is a data structure that associates a particular key with a value, so each key stored in the tile map corresponds to a specific memory address, and that address contains the data. The patent says the tile map sits in a memory other than the one where the textures are stored. So in which other memory could a tile map be available and accessible to all the TMUs? In the GPU's second-level cache. In the case of the Xbox 8, given that the CPU and GPU would communicate through that cache, the CPU could write to the same L2 cache where the textures are, so that the GPU can read it. On the other hand, the statement that the tile map may be in texture memory does not only mean that the tile map can be placed in main memory; textures can also be placed in the same memory as the tile map, which implies memory embedded in the GPU. Obviously not all the textures can be placed in the eDRAM, nor would even the whole scene fit, but this can be handled with tile rendering similar to that of PowerVR and the Xbox 360, which would also mean using this memory as an accumulation buffer.
The accumulation buffer, known as a Framebuffer Object in OpenGL jargon, is a section of memory where a frame can be computed, with the peculiarity that it is not the final frame and the same data is processed several times. It is used in techniques such as tile rendering and in post-processing effects based on manipulating the final image. You can work on a full frame or on small pieces of the frame, which are much easier to store in buffers. Effects such as alpha blending, motion blur and the various types of AA depend very heavily on bandwidth, which is why the frame is divided into fragments that are processed in memories closer to the processor; in the case of the Xbox 8/Durango these would be the caches.
To complicate things further, consider the VGLeaks leak about the alpha kits of the next Xbox:
The alpha kit uses a discrete graphics card similar in capability and speed to the GPU to be included in the final design. The card does not have the ESRAM that the final GPU design will have.
_________________________________________________________________________________________________________________________
Looking through more documentation, I found information about a technique presented by Sean Barrett at GDC 2008 called Sparse Virtual Texturing or SVT, which is the same as AMD's PRT.
Sparse Virtual Texturing is a way of simulating very large textures using much less memory than they would otherwise require, by loading only the data that is needed and using a pixel shader to map from the huge virtual texture to the actual physical texture.
The technique can be used for very large textures, or simply for large numbers of small textures (grouping them all into one huge texture, or using multiple page tables).
It was inspired by descriptions of John Carmack's MegaTexture in several private forums and emails. It is not exactly the same as MegaTexture, but it comes close.
In 2008, GPU memory management still lacked hardware-level virtual memory support, so the idea had to be implemented in software. Recently, with the GCN architecture, AMD has implemented all of this in hardware.



The problem is that this implementation is part of neither the current version of DX nor OpenGL, so it is not a fully standardized technology. On the other hand, some of you will ask what is so special about it if it belongs to the GCN architecture that PS Orbis will also use. Well, keep in mind that we have not yet reached the Xbox 8/Durango part, but this preliminary part is necessary to understand the architecture of the new system.
Third trip with the DeLorean: Durango
Three elements of Microsoft's next console are interesting:

  • Virtual Texturing
  • ESRAM
  • Data Move Engines
All these parts revolve around the same concept: the hardware implementation of Virtual Texturing.
ESRAM
Durango has no video memory (VRAM) in the traditional sense, but the GPU does contain 32 MB of fast embedded SRAM (ESRAM). ESRAM on Durango is free from many of the restrictions that affect EDRAM on Xbox 360. Durango supports the following scenarios:

  • Texturing from ESRAM
  • Rendering to surfaces in main RAM
  • Reading back from a render target without performing a resolve (in certain cases)
The difference in throughput between ESRAM and main RAM is moderate: 102.4 GB/s versus 68 GB/s. The advantages of ESRAM are lower latency and a lack of contention from other memory clients, e.g. the CPU, I/O and display output. Low latency is particularly important for sustaining the performance of the color blocks (CBs) and the depth blocks (DBs).
Strange and surprising as it may seem for a next-generation console, the Xbox 8/Durango has "only" 32 MB of video memory. Do you see now why I referred to the 3Dlabs Permedia 3? The memory on the 3Dlabs card did the same job that the ESRAM does in Microsoft's next console. This means that the GPU's L2 cache and the memory controllers, which in a traditional configuration would be connected to external memory, are here connected directly to the ESRAM. Current AMD APUs on the PC use the Radeon Memory Bus for communication between the GPU and external memory, which is 256 bits wide in each direction per memory channel (256 bits for reads and 256 bits for writes). In the case of Kryptos the width is 1024 bits in total, so either a second controller has been added or the existing width has been increased.
This is the first piece of the puzzle; the other two are still missing.
Virtual addressing
All GPU memory accesses on Durango use virtual addresses, and therefore pass through a translation table before being resolved to a physical address. This layer of indirection solves the problem of fragmentation of hardware resources: a single resource can occupy several non-contiguous pages of physical memory without penalty.
Virtual addresses can target pages in main RAM or in ESRAM, or can be unmapped. Shader reads and writes to unmapped pages return well-defined results, including optional error codes, rather than crashing the GPU. This capability is important for supporting tiled resources, which are only partially resident in physical memory.
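The translation step and the behaviour of unmapped pages can be sketched in a few lines. This is a toy model, not the Durango implementation: the 4 KB page size, the table layout, the pool names and the placeholder value returned for unmapped reads are all assumptions chosen for illustration.

```python
PAGE_SIZE = 4096  # assumed page size; the text does not state Durango's

# Hypothetical page table: virtual page number -> (pool, physical page number)
page_table = {
    0: ("ESRAM", 7),
    1: ("DRAM", 1200),
    2: ("DRAM", 35),     # physically non-contiguous with virtual page 1
    # virtual page 3 is intentionally left unmapped
}


def translate(vaddr):
    """Return (pool, physical address), or None for an unmapped page."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    entry = page_table.get(vpn)
    if entry is None:
        return None
    pool, ppn = entry
    return pool, ppn * PAGE_SIZE + offset


def shader_read(vaddr, memory):
    """Return (value, error_flag). Reads of unmapped pages yield a defined
    value plus an error flag instead of faulting (0 is a placeholder)."""
    location = translate(vaddr)
    if location is None:
        return 0, True
    return memory.get(location, 0), False


memory = {("ESRAM", 7 * PAGE_SIZE + 5): 42}
print(shader_read(5, memory))               # mapped page in ESRAM
print(shader_read(3 * PAGE_SIZE, memory))   # unmapped page
```

A partially resident texture would simply leave the table entries for its missing tiles empty, and the shader would use the error flag to fall back to coarser data.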
The benefits of virtual addressing for graphics do not need repeating, but what is most noteworthy is its use for Virtual Texturing, which is its main function. Its most direct benefit is to break the direct link between the information needed in each frame and the memory bandwidth. The reason is simple: in the traditional method the entire texture has to be loaded across the bus, whereas here only what is needed in each frame is loaded, as discussed earlier.
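To make the bandwidth argument concrete, here is a minimal sketch comparing the two approaches. The tile size, the 16x16 tile grid, and the sets of visible and resident tiles are invented for illustration only.

```python
TILE_BYTES = 64 * 1024  # ASSUMPTION: illustrative tile size

def bytes_to_upload(visible_tiles: set, resident_tiles: set) -> int:
    """Bytes that must cross the bus this frame: only tiles not yet resident."""
    missing = visible_tiles - resident_tiles
    return len(missing) * TILE_BYTES

# Hypothetical texture split into a 16x16 grid of tiles (256 tiles in total).
all_tiles = {(x, y) for x in range(16) for y in range(16)}
resident = {(0, 0), (0, 1)}                 # already in fast memory
visible = {(0, 0), (0, 1), (1, 0), (1, 1)}  # what the frame actually samples

full_cost = len(all_tiles) * TILE_BYTES     # traditional: whole texture
tiled_cost = bytes_to_upload(visible, resident)
print(f"whole texture: {full_cost} bytes, missing tiles only: {tiled_cost} bytes")
```

In this made-up frame only two tiles need to be transferred instead of all 256, which is the sense in which the data used per frame stops being tied to raw bus bandwidth.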
But of course, the description of the last piece of the puzzle is still missing. Granted, the GPU can see all of main memory, but how does it access it if its memory controllers point at the ESRAM? Through the DMEs, or Data Move Engines, which are the third and final piece of the puzzle.
Data Move Engine


The Data Move Engines can cause headaches at first, but they are easy to understand once you consider that their function is the same as that of PCI Express on the PC. With the CPU and GPU unified on a single chip, PCI Express would seem to lose its raison d'être, but it has a lesser-known and less used function: giving the devices connected to the PCI Express port direct access to main system memory. In the same way, the GPU gets access to the 8 GB of DDR3 system memory.
But the DMEs go further. The problem is that virtual memory support in the GCN architecture is not complete and is not handled 100% in hardware; it is a combination of software (through shaders) and hardware. The idea of the DMEs is that virtual memory management can be performed automatically by them, or alternatively the developer is free not to use this type of memory management if they see fit.


The concept is as follows: memory management in the shaders eats up shader cycles, so although the theoretical peak is higher, the bandwidth actually available may be lower. If you read the summary of Carmack's text above, you will see that it refers to the time spent waiting for data to arrive from main memory into video memory via DMA. The DMEs eliminate this problem by offloading this task from the shaders and running it in parallel with the computation of the scene.
The advantage of the move engines lies in the fact that they can operate in parallel with computation. During times when the GPU is compute bound, move engine operations are effectively free. Even when the GPU is bandwidth bound, the move engines may be able to use different paths. For example, a move engine copy from RAM to RAM will not be impacted by a shader that only accesses the ESRAM.
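The overlap argument can be illustrated with a toy timing model. All timings and names below are invented; real contention behaviour is certainly more complicated than a set intersection.

```python
def frame_time_ms(compute_ms: float, copy_ms: float, overlapped: bool) -> float:
    """Toy model: serial work adds up, overlapped work costs the longer of the two."""
    return max(compute_ms, copy_ms) if overlapped else compute_ms + copy_ms

def paths_conflict(copy_path: set, shader_path: set) -> bool:
    """Two operations contend only if they touch a common memory pool."""
    return bool(copy_path & shader_path)

# ASSUMPTION: timings are invented for illustration only.
compute_ms, copy_ms = 12.0, 3.0
serial = frame_time_ms(compute_ms, copy_ms, overlapped=False)
parallel = frame_time_ms(compute_ms, copy_ms, overlapped=True)
print(f"shader-driven copy: {serial} ms, move engine copy: {parallel} ms")

ram_to_ram_copy = {"RAM"}
esram_only_shader = {"ESRAM"}
print("contention:", paths_conflict(ram_to_ram_copy, esram_only_shader))
```

In this model the copy disappears from the frame time whenever it is shorter than the compute work, which is what "effectively free" means in the paragraph above.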
With the DMEs the puzzle is complete. With full hardware support, the management of virtual texturing does not have to be implemented inside each graphics engine, so it can be used with every engine and therefore in every game. The advantage is that the FLOPS that would have been spent on memory management can be directed to other tasks, leaving more of the GPU for graphics computation. The DMEs are not limited to Virtual Texturing, however: keep in mind that they can load any data into the ESRAM, from where it feeds directly into the GPU caches. Loading an octree node, for example? Their use goes beyond Virtual Texturing.
But can this be applied to PS Orbis? Basically, to implement this kind of virtual memory management it is necessary to have two levels of memory: an upper level with enough capacity for the image buffers (color, depth, and stencil) and the textures needed for the scene, and a lower level beneath it. On PS Orbis the GPU caches do not have enough capacity for this, and the GDDR5 is a single level of memory for the whole GPU. Obviously the ESRAM and all the supporting mechanism cost die area, which is a sacrifice in terms of computational capability. But the biggest advantage is that this allows access to large amounts of memory per frame without having to rely on the huge bandwidth of expensive, power-hungry memory such as GDDR5. The reason why Xbox 8/Durango does not use GDDR5 is that it would then be completely redundant: GDDR5 exists on graphics cards to avoid texture thrashing by means of higher bandwidth, while virtual memory on the GPU and Virtual Texturing are another solution to the same problem, so the two would conflict within one system.
I hope the article has made things clear and cleared up any confusion.
 