| Q&A |
Ed Plowman, ARM Mali Product Manager, March 2007 |
| |
 |
"OpenGL
ES 2.0 introduces a new programming paradigm by making
parts of the graphics pipeline programmable via shaders.
Shaders are small programs written in GLSL ES, a C-like
language, which can be used to manipulate vertex
and fragment data. This gives the application developer
much finer-grained control, opening up the possibility
of creating much more complex rendering effects. With
the introduction of OpenGL ES 2.0 acceleration we move
into the realms of next-gen console image synthesis." |
|
| Q1 |
We’ve been talking about 3D on mobile phones for a few years, and
there have been many announcements of that kind at 3GSM. Will we
soon see large-scale deployment of 3D on embedded devices? |
| A1 |
Since 3GSM spans the whole value chain, from fundamental technology
to the actual handsets and everything in between, it takes a
while for these things to filter through from one end of the
value chain to the other. We are now at the stage where quite
a large number of handsets are equipped with hardware acceleration
and key APIs such as M3G and OpenGL ES, so content creators
can start taking advantage of the hardware that’s available.
Whilst the graphically accelerated handsets released
over the last year or so have been fairly high end, there is
a clear move towards bringing these handsets into the mass market;
the W950 from Sony Ericsson, for example, is squarely targeted
at the consumer market. It’s also clear from some of the
chip announcements made at this year’s show that the trend
of including graphics acceleration will continue. So you can
expect to see an increasing number of handsets with graphics
acceleration appear over the next 12-18 months, with the truly
mass-market explosion happening in the 18-month to 2-year time
frame. You can also expect to see the effects of graphics acceleration
in mobile propagate beyond gaming and into GUIs to create much
richer user experiences (I think Apple has amply pointed the
way here with the iPhone), and this is likely to drive mass-market
adoption in the short term.
ARM has products ideally placed both to meet the high-end demand
for console-quality gaming, with the Mali200, and to greatly enhance
the user experience in the mass-market sector for both GUI and gaming,
with the compact but powerful Mali55. |
| |
|
| Q2 |
Do ARM11-type processors speed up 3D processing? |
| A2 |
In all graphics applications you need to strike a balance between
the CPU and the graphics processing unit (GPU). For more demanding
graphics applications (gaming in particular), the more CPU MIPS
you have, the better the experience the content provider can
create on the application side through more realistic interaction
with the virtual world (for example, more complex physics engines).
ARM11, with the ARMv6 instruction set, can greatly increase
the amount of processing capacity you have to play with, particularly
if it’s equipped with the VFP11 (the ARM11 version of ARM’s
Vector Floating Point unit). The v6 instruction set features
enhanced CPI, extensions for integer SIMD, and a micro-architecture
that accommodates faster clock speeds. |
| |
|
| Q3 |
The market is essentially dominated by ARM9. What level of performance
can we expect on mobile phones equipped with this chip? |
| A3 |
ARM9 with a well-written software rendering engine can produce a
very good level of graphics, with a peak triangle rate of somewhere
between 100K and 150K polys/sec, which is ideal for an entry-level
graphics-enabled product. By introducing a GPU, however, you
can really boost performance and image quality. For example,
an ARM9 used in combination with a GPU like Mali55 can produce
up to 1M polys/sec, and at only 1.4 mm² in 90G, Mali55 offers
an excellent balance of power, performance and area (PPA). |
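To put the peak rates quoted above into per-frame terms, here is a minimal arithmetic sketch. The polygon rates come from the answer; the frame rates of 15 and 30 fps are assumptions chosen purely for illustration, and real content rarely sustains a peak rate.

```python
# Peak triangle rates quoted in the answer (polygons per second).
SOFTWARE_ARM9_POLYS_PER_SEC = (100_000, 150_000)
ARM9_PLUS_MALI55_POLYS_PER_SEC = 1_000_000


def polys_per_frame(polys_per_sec, fps):
    """Upper bound on polygons per frame at a given frame rate."""
    return polys_per_sec // fps


# Frame rates below are illustrative assumptions, not figures from the interview.
for fps in (15, 30):
    low, high = (polys_per_frame(r, fps) for r in SOFTWARE_ARM9_POLYS_PER_SEC)
    gpu = polys_per_frame(ARM9_PLUS_MALI55_POLYS_PER_SEC, fps)
    print(f"{fps} fps: software {low}-{high} polys/frame, with GPU up to {gpu}")
```

The per-frame figure is what a content creator would budget a scene against, which is why the order-of-magnitude jump from software rendering to a GPU matters.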
| |
|
| Q4 |
There are several different approaches in the 3D segment: dedicated
chips (NVIDIA or AMD) or 3D instructions at the central processor
level. What are the main strengths and weaknesses of each? |
| A4 |
As you say, each approach has its pros and cons, and which one
fits best depends on your requirements. Augmented instruction sets
(SIMD units, etc.) can provide a good level of flexibility when
targeting graphics rendering. The Cortex-A8’s NEON unit is a prime
example of this, but when pushed to support high-end features
(such as advanced texture filtering, FSAA, etc.) it becomes
constrained by the CPU’s memory interface, which naturally
is optimized for the access patterns of general-purpose processing,
not graphics processing.
The adjunct GPU approach allows for easy addition to an existing
system, but suffers from having a relatively low-bandwidth
connection to the main CPU. This creates a performance bottleneck
for these systems, as all the data (textures, geometry,
etc.) used by the GPU has to be squeezed over this port.
These ports also generally operate on a “push”
model, requiring all the data to be “pushed” to
the I/O port by the CPU, tying up valuable CPU cycles that
could be spent on an application.
To ease this problem, adjunct GPUs often contain embedded
RAM (DRAM or SRAM), which the GPU uses for caching data, as
local processing space and for storage of the frame buffer.
This approach has a few drawbacks, however: it prevents
the efficient sharing of memory resources (you basically have
a relatively large block of memory that can only be used by
the GPU), and it also prevents mixing rendering
from the CPU and the GPU (a technique common in MIDP2 implementations).
For the mobile industry the preferred choice still remains
integrating the GPU into the SoC and using a unified memory
architecture (UMA). The Mali range of GPUs has been designed with
the constraints of these environments in mind.
|
| |
|
| Q5 |
With regard to the battery life of embedded devices, isn’t
hardware 3D acceleration too costly? In concrete terms, how
long can we “play in 3D” before needing to recharge
a mobile phone? |
| A5 |
If you look at the Nintendo DS, one of the most popular handheld
devices with 3D capability (which incidentally also uses ARM
CPUs), it has 12-13 hours of playtime from a single charge,
so it has set the upper bar for battery life on a gaming
device. Obviously this is a dedicated device, and there are a
lot of other things going on in a mobile handset besides
gaming, all of which take power, so shooting
for 12+ hours may be a longer-term goal. To help achieve this
goal, ARM is applying its expertise in delivering IP with class-leading
PPA to provide longer playing times for handsets and
also to increase the power efficiency of the enhanced user interfaces
currently being deployed.
From a power consumption point of view, by far the biggest impact
of introducing 3D graphics into a system is the extra memory
traffic. External memory accesses carry a very high power
cost (figures as high as 10x that of internal accesses are
not unheard of), so having an architecture that minimizes the
number of external accesses is essential for longer playtimes.
The Mali architecture uses an approach specifically designed
to reduce the number of external accesses through optimization of
the data structures used, effective caching strategies, and
the use of internal Z and colour buffers.
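The effect of keeping Z and colour traffic on-chip can be sketched with a toy energy model. Only the 10x external-to-internal cost ratio comes from the text above; the screen size, overdraw factor and per-fragment access counts are hypothetical, and the model is not a description of how Mali actually works.

```python
EXTERNAL_COST_RATIO = 10.0  # "as high as 10x", from the text; units are relative


def frame_energy(internal_accesses, external_accesses, ratio=EXTERNAL_COST_RATIO):
    """Relative energy of one frame, with one internal access costing 1 unit."""
    return internal_accesses + ratio * external_accesses


# Hypothetical workload: QVGA screen, each pixel covered three times on average.
pixels = 320 * 240
fragments = pixels * 3
accesses_per_fragment = 3  # assumed: Z read, Z write, colour write

# Design A: Z and colour buffers live in external memory.
energy_external = frame_energy(0, fragments * accesses_per_fragment)

# Design B: Z and colour buffers are internal; each finished pixel is
# written out to external memory once.
energy_internal = frame_energy(fragments * accesses_per_fragment, pixels)

print(f"external buffers: {energy_external:.0f} units")
print(f"internal buffers: {energy_internal:.0f} units")
```

Under these assumptions the design with internal buffers spends several times less energy on buffer traffic, which is the kind of saving the answer is pointing at.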
As the use cases for graphics in mobile are many and varied, ARM
is also applying its Intelligent Energy Management (IEM) technology,
which implements advanced algorithms to optimally balance workload
and energy consumption whilst maximizing system responsiveness
to meet end-user performance expectations. IEM implements the power
and energy savings using a technique called Dynamic Voltage and
Frequency Scaling (DVFS), so the end user should see no perceived
reduction in their experience, whilst at the same time battery
life is increased. IEM works with the operating system
and applications running on the mobile phone to dynamically
adjust the required performance level through a standard programmer's
model. This is perfect for mobile, where the system has fairly
diverse demands, such as going from a full-blown 3D game with
a high performance requirement to a GUI use case, which
generally has a much lower performance requirement, when the
handset receives an incoming call. |
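The general idea behind DVFS can be illustrated with a minimal sketch. It is not ARM's IEM algorithm: the operating points are invented for illustration, and the power model is the textbook approximation that dynamic power scales with V² × f.

```python
# Hypothetical (frequency in MHz, voltage in V) operating points.
OPERATING_POINTS = [(100, 0.9), (200, 1.0), (400, 1.2)]


def pick_operating_point(required_mhz, points=OPERATING_POINTS):
    """Return the slowest operating point that still meets the demand.

    Falls back to the fastest point if the demand cannot be met.
    """
    for freq, volt in sorted(points):
        if freq >= required_mhz:
            return freq, volt
    return max(points)


def relative_dynamic_power(freq_mhz, volt):
    """Dynamic power in arbitrary units, using the V^2 * f approximation."""
    return volt ** 2 * freq_mhz


# Assumed demands: a 3D game needing ~350 MHz, a GUI needing ~80 MHz.
for name, demand in (("3D game", 350), ("GUI", 80)):
    freq, volt = pick_operating_point(demand)
    power = relative_dynamic_power(freq, volt)
    print(f"{name}: {freq} MHz at {volt} V, relative power {power:.0f}")
```

Because voltage can drop along with frequency, the lighter GUI workload draws far less than a proportional share of the power needed by the game.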
| |
|
| Q6 |
Is Java (JSR 189-135) a good means of programming 3D content for
embedded devices? What are the differences between Jazelle
and the execution of “Just In Time” Java code? ARM
presented Mali, a 3D accelerator range. What are the main advantages
of those chips? |
| A6 |
Today, JSR-184 is a very good API for developers to create 3D content.
Java is the most popular execution environment available
to developers, and within it there are few options
for producing 3D content. The developer can try to create
3D graphics using the 2D API available in MIDP, but this will
be a very slow process and the graphics will look quite simple,
or the developer can use one of three APIs: JSR-184, JSR-239 or
MascotV3.
* JSR-184 is a widely deployed API shipping in hundreds of
different handsets from all of the top handset manufacturers,
so it gives developers a good customer base to sell their
games to. Some porting is required between phones,
but the effort to port a JSR-184 game can be less than the
effort to port a 2D game because the graphics automatically
scale to the correct screen size. For example, content developed
for Nokia phones that use Nokia's own JSR-184 implementation
can be run on Motorola phones that use ARM's Swerve Client
JSR-184 implementation, and the 3D graphics will require no
modification. Hardware acceleration is not required for JSR-184
devices, but many phones do now ship with an OpenGL ES 1.x
hardware accelerator such as the Mali55, and the Swerve Client
will make use of this to allow the developer to use much richer
graphics and achieve a high frame rate.
* JSR-239 is quite a new API, designed to allow programmers
to use the OpenGL ES API from Java applications. There are
no phones using JSR-239 and there is no content that uses
JSR-239 except for one benchmark, so it is too early to consider
JSR-239 an established standard. Judging from the uptake
of JSR-184, it will take about two years for JSR-239 to become
widespread.
* MascotV3 is a proprietary API that is used in Japan. Developers
should use MascotV3 if they are planning to sell their content
in Japan, but otherwise JSR-184 will give the developer a
larger customer base.
In the future, as native execution environments replace Java,
we will see native graphics APIs such as OpenGL ES and
OpenVG being used instead of the Java APIs, although it is
likely that for some classes of games developers will continue
to use a scene-graph engine such as ARM's Swerve Client in
order to shorten development time and to enable them to
easily port the games to those devices that still use Java.
The Java platform is increasingly being used to deliver services
beyond 2D and 3D games, such as push email, instant messaging
and music players. The problem with Java is that it must be
translated (from bytecode) into the language of the underlying
hardware (ARM) at runtime, when the phone is executing the
application. This often results in unacceptable performance.
There are two basic ways of speeding up Java that
are in use today:
1. Software-only techniques, such as Just-In-Time (JIT) compilation.
This requires a Java-to-ARM compiler running on the device,
which compiles the Java bytecode to ARM instructions. The
downsides here are around a 25% increase in static memory size
due to the compiler, a 3 to 8x increase in application size
due to the expansion of bytecode when it is compiled to ARM
code, and often unacceptable pauses caused by the compiler
running during game play.
2. Hardware acceleration, such as Jazelle DBX, where the majority
of Java bytecode is executed directly in the processor core.
The ARM core treats Java bytecode instructions as a native
instruction set of the processor, thereby achieving near-native
performance with no memory overhead and smooth, consistent
application behaviour.
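To give a feel for the JIT memory overheads quoted in point 1, here is a rough arithmetic sketch. The 25% and 3 to 8x figures are taken from the answer; the baseline sizes (a 1000 KB virtual machine and a 100 KB game) are hypothetical.

```python
JIT_STATIC_OVERHEAD = 0.25   # ~25% more static memory for the compiler (from the text)
JIT_CODE_EXPANSION = (3, 8)  # application grows 3x to 8x when compiled (from the text)


def jit_footprint_kb(vm_static_kb, bytecode_kb, expansion):
    """Approximate footprint with a JIT: larger VM plus expanded application code."""
    return vm_static_kb * (1 + JIT_STATIC_OVERHEAD) + bytecode_kb * expansion


def interpreted_footprint_kb(vm_static_kb, bytecode_kb):
    """Footprint when bytecode is executed without being expanded."""
    return vm_static_kb + bytecode_kb


# Hypothetical baseline sizes, for illustration only.
vm_kb, game_kb = 1000, 100
base = interpreted_footprint_kb(vm_kb, game_kb)
best, worst = (jit_footprint_kb(vm_kb, game_kb, x) for x in JIT_CODE_EXPANSION)
print(f"no expansion: {base} KB, with JIT: {best:.0f}-{worst:.0f} KB")
```

The larger the application relative to the virtual machine, the more the expansion factor dominates the total.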
|
| |
|
| Q7 |
From a 3D performance standpoint, can we make a comparison with other
devices (PSP, PS2, PC)? |
| A7 |
In terms of graphics performance and quality, the first generation
of handsets with 3D acceleration capability is similar to the Nintendo
DS or PSP (supporting OpenGL ES 1.0 and 1.1). In the 2nd generation
(appearing in 12-18 months) we expect to reach roughly similar
levels of image quality to the PS2 or GameCube. The 3rd generation
is probably the most interesting, as we will see the introduction
of OpenGL ES 2.0 capable acceleration.
OpenGL ES 2.0 introduces a new programming paradigm by making
parts of the graphics pipeline programmable via shaders. Shaders
are small programs written in GLSL ES, a C-like
language, which can be used to manipulate vertex and fragment
data. This gives the application developer much finer-grained
control, opening up the possibility of creating much more complex
rendering effects. With the introduction of OpenGL ES 2.0 acceleration
we move into the realms of next-gen console image synthesis. |
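The split between per-vertex and per-fragment programs can be mimicked in a few lines of plain Python. This is only a conceptual sketch: real shaders are written in GLSL ES and run on the GPU, and the function names, the uniform scale and the simple diffuse lighting used here are illustrative choices, not anything taken from the interview.

```python
def vertex_shader(position, scale):
    """Per-vertex stage: here, a uniform scale applied to the input position."""
    x, y, z = position
    return (x * scale, y * scale, z * scale)


def fragment_shader(normal, light_dir, base_colour):
    """Per-fragment stage: simple diffuse lighting from unit-length vectors."""
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    intensity = max(0.0, n_dot_l)  # surfaces facing away receive no light
    return tuple(c * intensity for c in base_colour)


# The application supplies the data; the "shaders" decide what happens to it.
triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
transformed = [vertex_shader(v, 2.0) for v in triangle]
colour = fragment_shader((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (1.0, 0.5, 0.0))
print(transformed, colour)
```

In a fixed-function pipeline both of these steps are hard-wired; making them replaceable is what gives the developer the finer-grained control described above.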
| |
|
| |
|
 |
|
|