CELL processor architectures programming guide

gamer220 · Aug 22, 2006


http://www.blachford.info/computer/articles/CellProgramming3.html
http://www.blachford.info/computer/articles/CellProgramming2.html
http://www.blachford.info/computer/articles/CellProgramming1.html
http://www.osnews.com/story.php?news_id=14969
http://www.blachford.info/computer/Cell/Cell0_v2.html
http://www.blachford.info/computer/Cell/Cell4_v2.html
http://www.blachford.info/computer/Cell/Cell1_v2.html
http://www.blachford.info/computer/Cell/Cell3_v2.html
http://www-128.ibm.com/developerworks/power/library/pa-cell/

-----------------------------------

Code:

Spufs: The Cell Synergistic Processing Unit as a virtual file system

The Linux programming model for Cell
Level: Intermediate

Arnd Bergmann (arnd@arndb.de), Kernel Hacker, Linux on Cell kernel maintainer, IBM Deutschland Entwicklung GmbH

25 Jun 2005

    Base platform support for Linux on the Cell has been established and is currently on its way into the mainstream Linux kernel tree. Read about the Cell's unique architecture and the SPU file system interface that allows Linux to run on it.

	

	

    This article is adapted from the paper The Cell processor programming model presented at LinuxTag 2005; see the Resources section for more details.

The Cell processor from Sony, Toshiba, and IBM® is this year's most awaited newcomer on the CPU market. It promises unprecedented performance in the consumer and workstation market by employing a radically new architecture. Built around a 64-bit PowerPC® core, multiple independent vector processors called Synergistic Processing Units (SPUs) are combined on a single microprocessor.

Unlike existing SMP systems or multicore implementations of other processors, on the Cell, only the general purpose PowerPC core is able to run a generic operating system, while the SPUs are specialized to run computational tasks. Porting Linux™ to run on Cell's PowerPC core is a relatively easy task because of the similarities to existing platforms like IBM pSeries® or Apple Power Macintosh, but this does not give access to the enormous computing power of the SPUs.

Only the kernel can directly communicate with an SPU and therefore needs to abstract the hardware interface into system calls or device drivers. The most important functions of the user interface include loading a program binary into an SPU, transferring memory between an SPU program and a Linux userspace application, and synchronizing the execution. Other challenges are the integration of SPU program execution into existing tools like GDB or OProfile.

A joint team of Sony, IBM, and Toshiba employees based in Austin, Texas, did the groundwork for the Linux kernel port. The current set of kernel patches is based on the latest 2.6.xx snapshot kernel and is maintained by the IBM LTC (Linux Technology Center) team in Böblingen, Germany. The team hopes to integrate most of this into the 2.6.13 kernel release so it will become part of upcoming distribution releases.

The Cell processor

PowerPC Processing Element

The Cell processor has a PowerPC Processing Element (PPE) that follows the 64-bit PowerPC AS architecture, as the PowerPC 970 CPU (also known as the G5) and all recent IBM POWER™ processors also use. Like the 970, it can use the VMX (AltiVec) vector instructions to parallelize arithmetic operations.

Moreover, the Cell processor can use simultaneous multithreading (SMT) like the IBM POWER5™ processor or Intel®'s Pentium 4 processors with Hyperthreading.

The IBM LTC has a standard Linux distribution running on the PPE and needs only a small number of kernel patches to add support for some of the hardware features that differ from existing target platforms. In particular, the Cell processor includes an interrupt controller and an IOMMU implementation, both of which are incompatible with those supported by older kernel versions.

The hardware we are running on at the LTC is a prototype of the Cell processor-based Blade, with two Cell processors running as a symmetric multiprocessing (SMP) system and, currently, 512MB of memory. It is designed to be used in an IBM BladeCenter™ chassis.

The integration of support for the PPE in one of the next kernel releases will enable the use of a single kernel binary for all current 64-bit PowerPC machines including Cell, Apple Power Mac, and IBM pSeries.

While no plans are in place to support 32-bit Linux kernels on Cell, it is possible to run both 32- and 64-bit distributions on it using the PowerPC 64 kernel with support for the ELF32 binary format. Note that all 32-bit PowerPC applications are expected to work without modifications.




Synergistic Processing Elements

The Synergistic Processing Elements (SPEs) are the most interesting feature of the Cell processor, as they are the source of its overwhelming processing power. A single chip contains eight SPEs, each with an SPU, a Memory Flow Controller (MFC), and 256KB of SRAM that are used as local store memory.

An SPU uses vector operations itself and can thereby execute up to eight floating point operations per clock cycle.
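One way to read the figure of eight per cycle is as a four-lane multiply-add, counted as one multiply and one add per lane. That reading is an assumption, not something the article spells out. The sketch below models it in plain C; the `vec4` type and the function names are hypothetical and are not SPU intrinsics.

```c
#include <assert.h>

#define LANES 4  /* a 128-bit register holds four 32-bit floats */

typedef struct { float v[LANES]; } vec4;

/* Four floats are assumed to fill exactly one 128-bit register. */
static_assert(sizeof(vec4) == 16, "vec4 should be 128 bits wide");

/* Element-wise a*b + c: one multiply and one add in every lane. */
vec4 vec4_madd(vec4 a, vec4 b, vec4 c)
{
    vec4 r;
    for (int i = 0; i < LANES; i++)
        r.v[i] = a.v[i] * b.v[i] + c.v[i];
    return r;
}

/* Floating point operations performed by one vec4_madd call. */
int flops_per_madd(void)
{
    return 2 * LANES;  /* multiply + add, in each of the four lanes */
}
```

Under this assumption, a single multiply-add issued every cycle accounts for all eight operations.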

Bus interfaces

The Cell processor has three high-speed bus interfaces, one for memory and two for I/O or SMP connections. The memory interface connects XDR DRAM chips, which are currently the fastest available memory technology, substantially faster than current DDR or DDR2 interfaces.

Like the memory interface, the other two interfaces are also based on Rambus technology. One of them is used exclusively to connect I/O devices, typically a south bridge or north bridge chip for the FlexIO protocol. The other one can also be used for I/O, or alternatively as a coherent interface to connect multiple Cell processors to an SMP system.

Basic SPU design

An SPU resembles a cross between a simple CPU design and a digital signal processor. It can use the same instructions to do either 32-bit scalar or 128-bit vector processing. It has an 18-bit address space that accesses 256KB of local store that are part of the chip itself. Neither a memory management unit nor an instruction or data cache are used. Instead, the SPU can access any 128-bit word in the local store at L1 cache speed.

Memory Flow Controller

The MFC is the main communication vehicle between the local store memory and the system memory. As mentioned before, there is one MFC in each SPE. It has an integrated memory management unit that is normally used to provide access to the address space of one process by using the same page table lookup as the PPE.

A DMA request always involves moving data between the SPE local store and a virtual address space on the PPE side. The types of DMA requests include aligned read and write operations as well as single word atomic updates that can be used -- for example -- to implement spin-locks that are shared between SPEs and user processes.
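The spin-lock idea can be illustrated without Cell hardware. The sketch below is a portable C11 analogue that uses `atomic_flag` in place of the MFC's atomic update requests; it does not show the MFC interface itself, and the `spinlock` type and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* A lock is a single atomically updated word, as in the DMA case. */
typedef struct { atomic_flag held; } spinlock;

void spin_init(spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);
}

/* Returns 1 if the lock was acquired, 0 if someone else holds it. */
int spin_trylock(spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

/* Busy-wait until the atomic update succeeds. */
void spin_lock(spinlock *l)
{
    while (!spin_trylock(l))
        ;  /* spin */
}

void spin_unlock(spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

On Cell the same pattern would rely on the atomic DMA requests mentioned above, so that an SPE and a user process can contend for one lock word in shared memory.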

Both the SPE and the PPE can initiate DMA transfers. The PPE does this through memory-mapped register access from kernel mode, while the SPE writes to its DMA channels from code running on the SPU.

An MFC can have multiple concurrent DMA requests to one address space outstanding from both the PPE and the SPU. Each MFC can access a separate address space.




Instruction set

Programs running inside the SPU need to be rather simplistic and self-contained, so you don't need complicated access protection or different privilege modes in the SPU itself. As a consequence, the instruction set contains mostly arithmetic and branch operations but none that resemble kernel mode instructions of the PPE.

Also, exceptions resulting from executed code aren't reported to the SPU itself. If a serious error occurs, for example, an invalid opcode, the SPU is stopped and an interrupt is delivered to the PPE. Some of the common sources of exceptions are not even possible on the SPU. For example, there are no addressing exceptions since all pointers get aligned and truncated to the local store size when attempting a memory access.
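The aligning and truncating of pointers can be modeled with two bit masks. This is only an illustrative model built from the 18-bit address space and 128-bit access width given earlier; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LS_SIZE   (1u << 18)  /* 18-bit address space: 256KB local store */
#define QUAD_SIZE 16u         /* one 128-bit word is 16 bytes */

static_assert(LS_SIZE == 262144u, "local store should be 256KB");

/* Truncate an address so it always falls inside the local store. */
uint32_t ls_wrap(uint32_t addr)
{
    return addr & (LS_SIZE - 1u);
}

/* Truncate, then align down to the enclosing 128-bit word. */
uint32_t ls_quadword(uint32_t addr)
{
    return ls_wrap(addr) & ~(QUAD_SIZE - 1u);
}
```

In this model every 32-bit value maps to some valid local store location, which is why an addressing exception cannot occur.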

The arithmetic vector operations are similar to the VMX operations of the PPE, and you can use them for highly optimized video, image processing, or scientific applications, among others.

The main communication method of the SPU with other parts of the Cell processor is defined by a number of "channels." Each channel has a predefined function and is either a read channel or a write channel.

For example, a mailbox mechanism is a basic communication method between the SPE and the PPE. The SPU has a read channel for receiving a single data word from the mailbox and two write channels for sending data words (more on this below). One of those write channels is defined to generate external interrupts on the CPU when data is available, and the other does not have a notification mechanism.

When an SPU tries to read from an empty mailbox, it will stop execution until some value is written to its memory-mapped register.

When the PPE wants to access the mailbox, it needs to have access to the memory-mapped register space, which is normally only available to kernel space. It has three mailbox registers for each SPU, and each of those accesses one of the three SPU mailbox channels.

The memory-mapped registers are used by the PPE to control certain aspects of an SPE, but are not accessible by the SPU code itself. For example, one PPE-side mailbox register appears as a write-only physical memory location. When the PPE writes a data word to that address, the SPU can read from its corresponding mailbox read channels.
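A mailbox can be pictured as a small queue of data words with one writer and one reader. The sketch below is a software model only: the depth `MBOX_DEPTH` is a hypothetical value chosen for illustration, the names are invented, and where a real SPU would stall on an empty mailbox the model simply reports failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MBOX_DEPTH 4  /* hypothetical queue depth, for illustration only */

typedef struct {
    uint32_t slot[MBOX_DEPTH];
    unsigned head;   /* index of the oldest unread word */
    unsigned count;  /* number of unread words */
} mailbox;

void mbox_init(mailbox *m)
{
    assert(m != NULL);
    m->head = 0;
    m->count = 0;
}

/* Writer side (the PPE register write in the text). False if full. */
bool mbox_try_write(mailbox *m, uint32_t word)
{
    if (m->count == MBOX_DEPTH)
        return false;
    m->slot[(m->head + m->count) % MBOX_DEPTH] = word;
    m->count++;
    return true;
}

/* Reader side (the SPU read channel). False if empty; the real SPU
 * would stop execution here until a word arrives. */
bool mbox_try_read(mailbox *m, uint32_t *word)
{
    if (m->count == 0)
        return false;
    *word = m->slot[m->head];
    m->head = (m->head + 1) % MBOX_DEPTH;
    m->count--;
    return true;
}
```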

Other channels are used to access virtual memory associated with a user context on the PPE. By writing to DMA channels, the SPE can initiate a memory transfer, which is executed in parallel to both the SPU code execution and the PPE control flow. Only when a page fault is hit, for example, because the accessed page has been swapped out to disk, does the PPE receive an interrupt.




Possible programming models

Character devices

Some kernel code is needed to use the SPUs from a Linux application, since the controlling registers are only accessible from the PPE in privileged mode. The simplest way to give userspace applications access to hardware resources is through a character device driver controlled through read, write, and ioctl system calls.

This is suitable for many simple devices and at some point was used for testing the capabilities of the processor, but the approach has a number of problems. Most importantly, if each SPU is represented by a single character device, it becomes hard for an application to find an SPU that is not yet used by another. Moreover, that interface does not allow virtualization of the SPUs on a multiuser system in a sane way.
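From userspace, the character-device approach would look like ordinary file I/O. The sketch below shows only that general shape: the article names no device node or ioctl numbers, so the path is passed in by the caller, and a name such as /dev/spu0 would be purely hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Write one data word to a character device.
 * Returns 0 on success and -1 on any failure. */
int send_word(const char *dev_path, uint32_t word)
{
    assert(dev_path != NULL);

    int fd = open(dev_path, O_WRONLY);
    if (fd < 0)
        return -1;  /* device missing, busy, or no permission */

    ssize_t n = write(fd, &word, sizeof word);
    close(fd);

    return n == (ssize_t)sizeof word ? 0 : -1;
}
```

The open call is where the problem described above shows up: with one device per SPU, an application has to probe each node in turn to find one that is free.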

System calls

A different approach to using SPUs is to define a set of system calls. This makes it possible to replace physical SPUs as the underlying unit of the abstraction from processes running on the SPU. SPU processes can be scheduled by the kernel, and all users can create them without interfering with each other. On the downside, this also means duplicating some infrastructure of the kernel as well as adding a potentially large number of new system calls to provide all necessary functionality.

For example, when a new thread ID space is managed next to the existing Linux process IDs, substantial changes to all system calls dealing with PIDs (kill, getpriority, ptrace, and so on), or alternatively new versions of those system calls, need to be provided. Neither alternative is desirable from a cross-platform point of view.
-------------------------------

Cell uses the standard C and C++ libraries.

gamer220 · Aug 22, 2006

http://research.scea.com/research/html/CellGDC05/index.html
http://findarticles.com/p/articles/mi_m0ISJ/is_1_45/ai_n16129817
http://news.taborcommunications.com/msgget.jsp?mid=561736&xsl=story.xsl
http://whitepapers.zdnet.co.uk/0,39025945,60167271p-39000410q,00.htm
http://news.com.com/Linux+gets+built-in+Cell+processor+support/2100-7344_3-6052314.html
http://news.com.com/PlayStation+3+chip+goes+easy+on+developers/2100-1043_3-5476933.html
http://developers.slashdot.org/article.pl?sid=06/07/14/1614223

Listing 1 SPE code example

/*************************/
/* filename: spe_hello.c */
/*************************/
#include <cpio.h>

/* Declare an EAR that holds the EA (effective address) of foo */
extern unsigned long long _EAR_foo;

char hello_string[] = "Hello World!";

int main(long long spuid, char** argp, char** envp)
{
    /* Copy the string from local store to the foo array at that EA */
    copy_from_ls(_EAR_foo, hello_string, sizeof(hello_string));

    return 0;
}

/* This section may be generated automatically by an SPE compiler */
/*****************************/
/* filename: spe_hello_toe.s */
/*****************************/
.section .toe, "a", @progbits
.align 4
.global _EAR_foo
_EAR_foo:
.octa 0x0

-------------------------------

/************************/
/* filename: ppe_main.c */
/************************/
#include <stdio.h>
#include <libspe.h>

/* the symbol "spe_hello_handle" is defined in spe_hello_csf.o */
extern spe_program_handle_t spe_hello_handle;

/* an EA symbol representing an array object */
char foo[512]; /* this is the foo array that spe_hello accesses */

int main()
{
    int rc, status;
    speid_t spe_id;

    /* load & start the spe_hello program on an SPE */
    spe_id = spe_create_thread(0, &spe_hello_handle, 0, NULL, -1, 0);

    /* wait for the SPE program to complete and return its status */
    rc = spe_wait(spe_id, &status, 0);

    printf("string from spe_hello: %s\n", foo);

    return status;
}
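For anyone without the SDK, the control flow of the listing can be imitated on an ordinary machine. The sketch below is only an analogy: a POSIX thread stands in for the SPE program, `memcpy` stands in for `copy_from_ls`, and the names `spe_hello_standin` and `run_hello` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static const char hello_string[] = "Hello World!";

char foo[512];  /* plays the role of the PPE-side foo array */

static_assert(sizeof hello_string <= sizeof foo, "string must fit in foo");

/* Stand-in for the SPE program: copy the string into foo. */
static void *spe_hello_standin(void *arg)
{
    (void)arg;
    memcpy(foo, hello_string, sizeof hello_string);
    return NULL;
}

/* Stand-in for the PPE main: start the worker, wait for it, print.
 * Returns 0 on success and -1 if the thread could not be run. */
int run_hello(void)
{
    pthread_t worker;

    if (pthread_create(&worker, NULL, spe_hello_standin, NULL) != 0)
        return -1;
    if (pthread_join(worker, NULL) != 0)
        return -1;

    printf("string from worker: %s\n", foo);
    return 0;
}
```

The difference on Cell is that the worker cannot see foo directly. It only knows the effective address and has to move the data out of its local store, which is what the EAR symbol and the copy helper in the listing are for.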

------------------

gamer220 · Aug 22, 2006

On the PlayStation 2 hardware, programmers mostly worked in C and assembly, and programming was fairly hard. In the PS3 architecture we see that Cell uses the standard C and C++ libraries, which have been ported to Cell's machine language, but that does not mean programming it is easy. Cell uses a complex and somewhat obscure architecture, and exploiting its capabilities is difficult. Manual memory management and working with the pipelines make up the bulk of the work of programming the hardware:

http://www.internetnews.com/ent-news/article.php/3469111
http://www.computerworld.com.au/index.php/id;840513452;fp;2;fpid;1
http://www.1up.com/do/newsStory?cId=3138237&did=1
http://events.ccc.de/congress/2005/...r.pdf#search="programming for CELL processor"
http://merchant4u.electricvenom.com/cell-processor.html
http://isg.cs.tcd.ie/eg2005/IS4.html
http://pcburn.com/article.php?sid=1539
http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/9F820A5FFA3ECE8C8725716A0062585F
Making games needs a mid-level language, one in which you can do both low-level, close-to-the-hardware work as in assembly and high-level work as in BASIC, Pascal, and so on. 3D processing and rendering is done through OpenGL. That covers building the 3D environment, which is separate from managing the processing: on Cell, managing the processing is not handed over to DirectX or OpenGL as it is in PC games. (Those two are libraries of 3D components, and apart from that they also take full charge of the hardware in order to drive it while a game or any other 3D work runs.)

http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/9F820A5FFA3ECE8C8725716A0062585F/$file/BE_Handbook_v1.0_10May2006.pdf

You may be wondering why games are not written in Java. Isn't Java both powerful and portable (it runs on any operating system)?

Java has a virtual machine, the Java Virtual Machine, which is written in C and C++. The virtual machine is like a (virtual) CPU: instead of being compiled to machine language, your code is compiled for the virtual machine, and the compiled code is something Java itself understands, not the CPU. One reason Java is portable is that Java is a platform, like Win32, Win64, .NET, and so on. To be portable, Java has to provide its own set of platform APIs, just as Windows does; think of the virtual machine as a Windows inside Windows! Most of the time Java does not let you use the operating system's APIs, so you use the Java platform's own API directly. So now the program is written. Java does not produce an .exe; it builds a .jar instead. This file contains bytecode (the code the compiler produces for the virtual machine), which is translated to machine code at run time, and that costs programs a great deal of speed. And because the Java platform, like OpenGL, is cross-platform, it is slow, which is why it is not at all suitable for 3D work. For now C++ is the fastest.

http://uk.builder.com/0,39026540,39282923,00.htm
http://www.theregister.co.uk/2005/02/01/cell_analysis_part_one/
http://www.cell-processor.net/e107_plugins/chatbox_menu/chat.php
http://www.embeddedstar.com/press/content/2006/2/embedded19585.html

Cell processor SDK launched

http://www.eetasia.com/ART_8800425646_499495_6e4dc60e200607.HTM
I'll add that the PS3 is a hard platform to make games for, on a par with writing a game in asm!

gamer220 · Aug 22, 2006

Be sure to read this:

Code:

IBM reckons Cell, potent and versatile, can do a lot more than just play games. It sees a role for it in mobile phones, handheld video players, high-definition televisions, car design and more. Scientists at Stanford University are building a Cell-based supercomputer. Toshiba plans to use the superchip in TV sets, which one day could let fans watch a football game from multiple camera angles they control. Raytheon is set to use Cell in missile systems, artillery shells and radar. Other companies envision new high-definition medical imaging. "Cell is the next step in the evolution of the microprocessor. It's a peek into the future," says Craig Lund, chief technology officer at Mercury Computer Systems, which makes medical and military systems and is taking orders for Cell servers.


Quote:
An IBM demo shows the contrast. A terrain rendering program lets you fly over Mount Rainier at 1,300mph. Cell crunches through millions of lines of topographical and photographic data per second to paint topographically accurate, photo-quality pictures at a movie-quality 30 frames per second. On a similar program a Pentium takes more than two minutes to sketch a single frame.


Quote:
IBM is already at work on beefier versions of Cell, and it has launched an all-out campaign to woo a new generation of code-crunchers and game boys to write software for its futuristic chip. In an extraordinary move IBM disclosed hundreds of Cell's design secrets on the Internet, releasing a developer's guide that 10,000 programmers have since downloaded. IBM, with annual sales of $94 billion, says Cell could power hundreds of new apps, create a new video-processing industry and fuel a multibillion-dollar buildout of tech hardware over ten years.


Quote:
Cell's creators needed to strike a balance between raw power and the versatility to do more than just play games. Special graphics chips are superspeedy, but for only one task. General-purpose chips like those made by Intel devote a lot of muscle to the ability to handle a wide variety of jobs, but they aren't superfast at any one of them. For two decades Intel boosted performance by cramming more transistors onto a chip, but now chips draw so much power and generate so much heat that they can't be cranked up much more. Intel and others boost performance by lashing together two or more thinking elements on a single chip. Intel makes dual-core chips. Sun's Niagara boasts eight cores. For Microsoft's Xbox 360, IBM linked three Power cores. But even these multicore chips will not be powerful enough to drive the next wave, Kahle argues. Cell needed an entirely new design.


Quote:
Last year IBM began its own evangelizing. Instead of revealing design details to only a small number of potential partners sworn to secrecy, IBM trumpeted Cell's secrets on the Internet, releasing 700 pages of documents describing the new architecture and a 1,100-page development kit, free for Internet download. "We've opened up the architecture and provided all the details," Kahle says. "We want to see this architecture proliferate in the marketplace."



Quote:
The wooing is necessary, for Cell's "asymmetric" design (its eight co-processors have a different architecture than the main core), though key to the chip's superior performance, is also what makes writing software for it so difficult. In the mainstream chip world designers use an array of tool kits and high-level programming languages (such as C++) to easily convert instructions into a form the chip can comprehend. Such tools exist for Cell, but the chip's design is so complex and so utterly different from anything before it that code-crunchers do some of the work "down on the metal," cranking out basic assembly code, which can take five times as long.

The good news: Some designers say creating games for Cell is far less complicated than writing for PlayStation 2. "Anyone who worked on the PlayStation 2 is jumping for joy," says Jeremy Gordon, chief executive of Secret Level, a gamemaker in San Francisco that is remaking a classic 1980s Sega videogame for the new Sony box.


Quote:
The PlayStation hook inspires confidence at Raytheon, the Waltham, Mass. defense contractor, which has studied Cell for 15 months and plans to use it in scores of next-generation systems. "Sonar, infrared sensors--there are hundreds of products that Raytheon designs that could use this type of technology," says Peter Pao, chief technology officer. "Current chips are going to run out of steam. We always look to the future."

At Mercury Computer Systems in Chelmsford, Mass. engineers are working on a Cell system called Turismo, which is due later this year and will pack up to 128 Cell chips into a 6-foot-high rack, producing up to 25 trillion calculations per second. Mercury, which sells modules for medical gear made by General Electric, Philips and Siemens, says Turismo could make a CT scanner so fast that it will be able to paint a 3-D image in four seconds versus five minutes on an Intel Pentium. Mercury is even pushing Cell to firms that create computer-generated special effects for movies. "This chip is opening doors for us," says Joel Radford, a Mercury vice president.

gamer220 · Aug 22, 2006


This is what they mean when they say PS3 programming is hard!

All of that code (Listing 1 above) just gets a single string, Hello World (the best-known first program in any language!), back from one of the SPEs. The PPE creates a thread on one SPE, and the SPE program copies the string from its local store into the foo array in main memory.

In the end it prints hello world. It made me sick!

The same program in C (for a PC CPU):

#include <stdio.h>

int main(void)
{
    printf("Hello World!\n");
    return 0;
}

gamer220 · Aug 22, 2006

How a compiler is implemented for the Cell processor:
http://www.xonce.net/files/OCFCP.pdf

Software development kit and emulator for the Cell processor:

http://dl.alphaworks.ibm.com/technologies/cellsw/CBE_SDK_Guide_1.1.pdf

http://xonce.net/images/cmldc.JPG
http://xonce.net/images/cmblc.JPG

pandora tomorrow · Aug 22, 2006

Hello.
Dear friend, this is a Persian-language forum. Wouldn't it be better to add a bit of explanation in Persian too?

For example, I don't know what these programs are good for.

gamer220 · Aug 22, 2006

Enjoy:

http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/05CB9A9C5794A5A8872570AB005C801F/$file/2056_IBM_TRE.pdf
terrain rendering engine guide

Code:

The IBM Cell Broadband Engine™ (Cell BE) SDK, Version 1.1, is a complete Cell BE development environment. The SDK contains binaries and source code that are available for downloading from both alphaWorks and Barcelona Supercomputing Center's Web site. The SDK here on alphaWorks contains IBM-authored material, including Library and Samples Source Code, IBM XL C/C++ Alpha Edition for Cell Broadband Engine Processor (a compiler), and IBM Full-System Simulator for the Cell Broadband Engine Processor. The Barcelona Supercomputing Center's Web site contains open-source projects that have been modified for Cell BE Processor; these include GNU GCC compilers for PPU and SPU, Linux Kernel 2.6.16, SPE Library support, NUMA support, and a system root image for the Full System Simulator.

For further information about the Cell BE SDK, please see the Cell BE SDK Installation and User's Guide.

How does it work?
The "cellsdk" installation script automatically downloads the required files from Barcelona Supercomputing Center. The ISO image, which can be burned to a CD, includes all of the IBM SDK material, installation script, and Cell BE documentation in one easy-to-use package.

IBM Cell BE SDK Version 1.1 contains a number of significant enhancements over Versions 1.0 and 1.0.1, and it completely replaces those previous SDKs. These enhancements include the following:

    * Linux kernel (2.6.16) and library support for Cell BE-based blade servers that contain two Cell Broadband Engine Processors for a total of 16 SPEs.
    * PowerPC 64-bit hardware, such as Apple Power Mac G5 and IBM PowerPC, is a supported development platform.
    * C++ support has been added to the XL C compiler for PPU applications.
    * Support has been added for GDB server running in both PPEs and SPEs.
    * The GNU GCC compiler for PPU and SPU programs has been upgraded to Version 4.0.2.
    * Binutils have been upgraded to Version 2.16.1.
    * Additions and updates to the libraries and samples include a new sample that ray traces the quaternion Julia Set.
    * Added support for Non-Uniform Memory Access (NUMA) improves the performance of memory accesses between SPEs.
    * Improved installation uses a completely revamped process and RPMs.

The IBM Library and Samples Source Code package contains working examples and libraries that demonstrate programming techniques and performance of Cell BE Architecture. For example, a variety of application-oriented libraries, including Fast Fourier Transform (FFT), image, audio resample, math, game math, intrinsics, matrix operation, multi-precision math, noise generation, oscillator, surface, synchronization, and vector, are included in order to demonstrate the versatility of CBE architecture. Additional samples and workloads demonstrate how a programmer can exploit the on-chip computational capacity; included is a large FFT workload that showcases a performance that is more than an order of magnitude higher than a traditional processor.

For further information about the compiler, please see IBM XL C/C++ Alpha Edition for Cell Broadband Engine Processor.

For further information about the simulator, please see IBM Full-System Simulator for the Cell Broadband Engine Processor.

Download the software development kit for Cell and its emulator:

systemsim-cell-1.0.1-fc4_ppc32.tar.bz2 8022KB Tar file containing IBM Full-System Simulator for the Cell Broadband Engine Processor
systemsim-cell-1.0.1-fc4_x86.tar.bz2 7873KB Tar file containing IBM Full-System Simulator for the Cell Broadband Engine Processor
systemsim-cell-1.1-6.i386.rpm 8695KB IBM Full-System Simulator for the Cell Broadband Engine Processor for Intel
systemsim-cell-1.1-6.ppc.rpm 8890KB IBM Full-System Simulator for the Cell Broadband Engine Processor for PowerPC

Download: https://www14.software.ibm.com/webapp/iwm/web/preLogin.do?source=AW-0LO

Guide:

http://www.alphaworks.ibm.com/tech/cellsystemsim

Code:

Version 1.1: SIMD performance improvements; better SPE code generation (branching code and code scheduling); improved results for float divide and double divide, including Inf and Nan handling.

What is the IBM XL C/C++ Alpha Edition for Cell Broadband Engine (Cell BE) Processor?

The IBM XL C/C++ Alpha Edition for Cell BE Processor is a cross-compiler that is tuned for the Cell Broadband Engine Architecture (CBEA). This C/C++ compiler, which is hosted on Fedora Core 5 for x86, generates code for the PowerPC Processor Element (PPE) or Synergistic Processor Element (SPE).

The compiler supports the revised 2003 International C++ Standard ISO/IEC 14882:2003(E), Programming Languages -- C++ and the ISO/IEC 9899:1999, Programming Languages -- C standard, also known as C99. It also supports the C89 Standard and K&R style of programming. In addition, the compiler supports numerous GNU Compiler Collection (GCC) C and C++ extensions in order to help users port their applications from GCC.

The IBM XL C/C++ Alpha Edition for Cell BE Processor is part of a family of IBM compilers that support C/C++ programming on IBM's pSeries®, iSeries®, and zSeries® platforms. Supported operating systems for IBM XL compiler family include AIX®, Linux®, z/OS®, z/VM®, and OS/400®. XL C/C++ Alpha Edition for Cell BE Processor uses the same compiler front end and optimization technologies as these commercially-available products.

How does it work?
The IBM XL C/C++ Alpha Edition for Cell BE Processor provides three invocation commands: ppuxlc, ppuxlc++, and spuxlc. The commands ppuxlc and ppuxlc++ are used to generate code for the PPE, and spuxlc is used to generate code for C on the SPE (C++ for SPE is not available in current version).

The compiler invocation commands for the PPE perform all necessary steps for compiling C/C++ source files by ppuxlc or ppuxlc++ into .o files and linking the object files and libraries by ppu-ld into an executable program. Similarly, the compiler invocation command for the SPE performs all necessary steps for compiling C source files by spuxlc into .s files, assembling .s files into .o files by spu-as, and linking the object files and libraries into an executable program by spu-ld. The Cell BE Software Development Kit also provides the tool ppu-embedspu for linking a PPE executable and an SPE executable into a single executable.

The compiler includes five base optimization levels:

    * -O0: almost no optimization
    * -O2: strong, low-level optimization that benefits most programs
    * -O3: intense, low-level optimization analysis with basic loop optimization
    * -O4: all of -O3 and detailed loop analysis and good whole-program analysis at link time
    * -O5: all of -O4 and detailed whole-program analysis at link time.

Auto-SIMDization is enabled at O3 -qhot or O4 and O5 by default for the PPE, and at O3 -qhot or O4 and O5. SIMD has been improved to better handle relatively aligned streams from run time-aligned individual streams. (SIMD stands for Single Instruction and Multiple Data.)
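As an illustration of the kind of loop auto-SIMDization is aimed at, here is a minimal C99 sketch. Whether a particular compiler and optimization level actually vectorizes it is an assumption; `restrict` is standard C99, and the function name `saxpy` is simply the conventional name for this operation, not something taken from the SDK.

```c
#include <assert.h>
#include <stddef.h>

/* y[i] = a * x[i] + y[i] for i in [0, n).
 * The iterations are independent and the arrays are declared
 * non-overlapping with restrict, so a vectorizing compiler is free
 * to process several elements per instruction. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(n == 0 || (x != NULL && y != NULL));

    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Loops with data-dependent branches or possibly overlapping pointers are much harder for a compiler to turn into SIMD code, which is why the qualifiers matter here.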

This technology is part of the Cell Broadband Engine Software Development Kit. For further information, please see the Cell BE SDK Installation and User's Guide.

Saman · Aug 22, 2006

This is what they mean when they say PS3 programming is hard!!!!

What a price you have to pay just to print a string!! Drop in a single cout and you're done.

As you said yourself, on a PC it's one line.

Anyway, there must be something about it that makes the difficulty worth it!

----

Arian, can you dig up the coding details for the 360 too? They say it's much simpler. Post them so we can compare with the PC.

gamer220 · Aug 22, 2006

It really is a hassle. The programmers' main problem is working with the SPEs. If getting a simple string across is this hard, imagine building a game....

The x360 has a set of visual tools, and its hardware fully supports .NET. XNA itself is a framework built on .NET, and it is programmed in C#. The x360 has a large, ready-made library for producing games RAD-style! (The x360 compared to the PS3 is like Delphi compared to C.) On the PS3 we have nothing, only a few ready-made tools (which have to be there anyway) such as the OpenGL library. Of course, making games for the PS3 does not depend on code alone: you have a dev kit and can do much of the graphics work with whatever tool you are comfortable with. I tried to download the PS3 SDK from that link and saw that it runs on Linux, preferably Fedora Core 5, and does not support Windows. The SDK is 8 MB, while the Java SDK, without the virtual machine and compiler, is 20 MB! (with the NetBeans IDE!!)

http://msdn.microsoft.com/directx/xna/

I haven't found anything yet. All I've learned is that the x360 supports the .NET platform and the ANSI/ISO 2003 C/C++ standard. (Working with the .NET tools makes things muuuuch easier; .NET is Microsoft's next-generation standard platform and the replacement for Win32.) The standard tool ported for game development is the C# language, and I don't think it supports C++ under .NET.

One reason Microsoft did not make the .NET tools the only option for game development is the fear of losing game studios. The .NET languages such as C# are all high-level and can't do low-level work, while big games need full command of the console hardware, so that they can both work through the 3D interpreter and keep full control of the hardware themselves. Gears of War, for example, is written in C++ and has full access to the hardware. (In more limited languages, people usually reach the hardware through DirectX or through ready-made APIs, which are mostly written in C.) C# doesn't have that kind of power; it is an application-programming language and offers no support for systems programming. Some games need both close-to-the-machine access, like asm, and high-level, user-level access, like Pascal, C#, BASIC, and so on.

Overall the x360 is very close to the PC. The reason Carmack favors the x360 is that programming for it is simple and needs no training (no lengthy training, that is; the training amounts to reading the docs when you run into a problem). And Windows, as you know, has a thousand visual tools, so x360 = RAD.

http://editorials.teamxbox.com/xbox/1688/XNA-Game-Studio-Express-Power-to-the-Gamers/p1/

------------------

Terra Nova is not the usual place I go to get news about programming language improvements. But they linked to a great presentation from POPL 2006 by Tim Sweeney of Epic Games. Tim's talk is called The Next Mainstream Programming Language: A Game Developer's Perspective, and it discusses at great length the major issues facing game developers today. As Nate Combs at Terra Nova remarked, most of these issues are not specific to the game industry, but will likely be seen there first.

Most interesting (to me) was the issue of concurrency. Tim uses Gears of War for all his examples. Of course, Gears of War is an Xbox 360 exclusive. Xbox 360, as many of you probably know, has three hyper-threaded CPUs for a total capacity of six hardware threads. Herb Sutter talked about this in his DDJ article The Free Lunch Is Over. Tim points out, rightly so, that "C++ is ill-equipped for concurrency". C#, Java and VB aren't much better. Tim concludes that we'll need a combination of effects-free non-imperative code (which can safely be executed in parallel) and software transactional memory (to manage parallel modifications to system state).
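The split between effects-free code and managed state changes can be hinted at in plain C. The sketch below is not software transactional memory and is not taken from Tim's talk. It only shows the optimistic read, compute, commit-or-retry pattern that STM generalizes, using a C11 compare-and-swap. The names `score`, `apply_bonus`, and `add_bonus` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared state; the name is illustrative only. */
static _Atomic long score = 0;

/* Effects-free: the result depends only on the arguments and nothing
 * outside the function is read or written, so it is safe to call from
 * any number of threads at once. */
static long apply_bonus(long current, long bonus)
{
    return current + bonus;
}

/* Optimistic update: take a snapshot, compute the new value from it,
 * and commit only if no one changed the state in the meantime.
 * If the commit fails, the loop retries with the fresh value. */
long add_bonus(long bonus)
{
    assert(bonus >= 0); /* this sketch only models non-negative bonuses */
    long seen = atomic_load(&score);
    long next;
    do {
        next = apply_bonus(seen, bonus);
        /* On failure, `seen` is overwritten with the current value. */
    } while (!atomic_compare_exchange_weak(&score, &seen, next));
    return next;
}
```

A real STM would cover several memory locations in one transaction and roll back all of them on conflict. A single compare-and-swap covers only one word, which is why this is a sketch of the idea and not a substitute.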

Tim also touches on topics of performance, modularity and reliability. And he has an eye on the practical at all times. For example, he points out that even a four times performance overhead of software transactional memory is acceptable, if it allows the code to scale to many threads.
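The trade-off can be checked with back-of-the-envelope arithmetic. The sketch below assumes perfectly linear scaling across threads and a uniform overhead factor. Both are simplifications made here for illustration and are not claims from the talk. The function name `stm_speedup` is hypothetical.

```c
#include <assert.h>

/* Idealized model (an assumption, not a measurement): code that runs
 * `overhead` times slower per thread but scales linearly across
 * `threads` threads ends up threads / overhead times as fast as the
 * original single-threaded code. */
double stm_speedup(double overhead, unsigned threads)
{
    assert(overhead > 0.0);
    return (double)threads / overhead;
}
```

Under this model, a four times overhead breaks even at four threads (`stm_speedup(4.0, 4)` is 1.0), and the six hardware threads mentioned above would give 1.5 times the single-threaded speed. Real code scales less than linearly, so the actual break-even point would sit higher.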

Anyway, it's a great read so check it out. Also, MS Research has a software transactional memory project you can download if you're so inclined.

------------------------------

http://forums.xbox.com/1/6075685/ShowPost.aspx

The D language is really awesome too :d, if it gets supported. It was created by Walter Bright in '99, and the PS4 will surely support it.

kaveh007 · Aug 24, 2006

Going by the way you've described it, dear Arian, making a game for the PS3 must be a mammoth task :cheesygri But even so, Japanese developers make high-quality games for Sony to put Microsoft and its 360 in their place :-"

pixolator · Feb 1, 2007

That was very interesting... thanks.

test84 · Feb 1, 2007

Yeah, the downside is that you can see a big difference between Sony's games and other companies' games, especially the games that also come out on other consoles.
