The Mac's Move to Intel
Dr. Dobb's Journal October, 2005
Migrating applications to the new hardware platform
By Tom Thompson
Tom is a technical writer providing support for J2ME wireless
technologies at KPI Consulting. He can be contacted at
tom_thompson@lycos.com.
A Portable Operating System
Apple Computer's CEO Steve Jobs clearly dropped a bombshell when he
told software developers at Apple's World Wide Developer's Conference
(WWDC) that the Macintosh computer platform was going to switch from
PowerPC to Intel x86 processors. He added, of course, that there would
be time to adapt and manage the change, because Macs with Intel
processors would be phased in over a two-year period.
Mac old-timers will recall that Apple accomplished a similar major
processor transition in 1994. Back then, the switch was from Motorola's
68K processors to Motorola/IBM's PowerPC. For software developers, the
transition's difficulty depended upon whether their programs used
high-level code exclusively, or accessed lower-level system services.
In the latter case, it required that developers make fundamental
changes in how they wrote their code; some of these changes I helped
document in the book The Power Macintosh Programmer's Guide (Hayden, 1994). As a consequence, Apple has ample prior experience to guide its migration to the x86 processor.
Pundits and others have already debated the wisdom and reasons for
making the processor switch. The decision has been made; it's not worth
repeating those arguments here. Seasoned Mac programmers are
determining how painful and how expensive the transition is going to
be. In this article, I examine the migration plan, and describe
pitfalls you should be aware of.
Infrastructure Support
Migrating an application to the new hardware platform requires that
there be a certain amount of infrastructure in place to support its
development and execution. There are several key technical pillars that
comprise this infrastructure. There must be:
- A native version of the operating system available to provide the needed system services.
- A
mechanism to package an application's code for distribution and
execution on disparate processors during the transition period. This
scheme must be transparent to less tech-savvy users, or else you
frustrate them when the application won't run because it's on the wrong
platform.
- Tools that translate the application's source code into the platform's native code.
At the WWDC announcement, Jobs revealed that, since its introduction
in 2001, every PowerPC release of Mac OS X has had an x86-based
DopplegŠnger—a separate Intel version of the OS was quietly developed
and maintained (see the accompanying text box entitled "A Portable
Operating System," for how Mac OS X's design allowed this). Mac OS X
10.4, (aka Tiger), which was released this April for the PowerPC Macs,
is slated to be the preliminary x86-based OS release. In short, the
first infrastructure pillar is therefore already in place and has been
tested for years.
The core of the Mac x86 platform's distribution mechanism is the
universal binary file for Mac OS X applications. A universal binary
carries two versions of the application—a version implemented as
PowerPC machine code, and a version implemented in x86 machine code
that is stored in a single, larger executable file. However, the
application's GUI elements—TIFF images of buttons or other visual
controls—can be shared between the two versions. Sharing these GUI
elements, known as "resources," helps keep the universal binary
application's size manageable.
The universal binary scheme is an extension of Mac OS X's Mach-O
executable files (see http://developer.apple.com/
documentation/DeveloperTools/Conceptual/
MachORuntime/FileStructure/chapter_2.1_section_7.html#//apple_ref/doc/uid/
TP40000895-//apple_ref/doc/uid/20001298-154889-CJBCFJGH). Universal
binary files consist of a simple archive that contains multiple Mach-O
files; see Figure 1.
Each Mach-O file contains contiguous bytes of an application's code and
data for one processor type. Special file headers enable the platform's
runtime to quickly locate the appropriate Mach-O image in the file. Listing One
shows how this is done. The "fat" header identifies the file as a
universal binary and specifies the number of "fat" architecture
structures that follow it. Immediately past the fat header, each fat
architecture data structure references the code for a different
processor type in the file. These architecture structures supply the
runtime with the CPU type, an offset to the start of the embedded
Mach-O file within this file, and its size. When a universal binary
application is launched, the operating system uses the headers to
automatically locate, load, and execute the Mach-O file of the
application that matches the Mac's processor type.
Universal binaries thus form the second support pillar, the
distribution mechanism. Typical users are unaware of the dual sets of
code packaged in the file, and they can copy the application by simply
dragging and dropping it. When they launch the application, the Mac OS
X runtime performs a sleight-of-hand that loads the appropriate
application code from the universal binary file, then executes it. No
matter what Mac it's installed on, a universal binary application thus
executes at native speeds on either PowerPC- or x86-based systems.
Mac old-timers will recognize the universal binary scheme as an echo
of the "fat" binary file format that handled software distribution
during the Mac platform's transition to the PowerPC. Fat binaries
consisted of two versions of the application—68K and PowerPC—and the
Mac OS determined which version to load and run. The fat binary
distribution scheme worked very well, and based on its success, I have
high expectations that the universal binary scheme will work, too.
The third pillar, the development tools necessary to generate x86
code for the x86 Mac platform, is represented by Apple's Xcode 2.1
development tool set. Xcode consists of an IDE with GUI-based tools
such as a source-code editor and debugger. It uses GCC 4 compilers to
generate x86 machine code from C, C++, Objective-C, and Objective-C++
source code. Source-level debugging is possible through the use of the
standard GDB tool. For x86 development, you'll need to install Xcode
2.1, along with the 10.4 Universal SDK. This SDK contains the APIs and
header files that enable you to generate PowerPC code, x86 code, and
universal binaries. Generating a universal binary becomes just a matter
of selecting both PPC and Intel processors for code generation in
Xcode's controls, and building the program. Metrowerks, whose
CodeWarrior toolset helped Apple get through the 68K/PowerPC
transition, will not be participating in this transition. The company
has sold its x86 compiler and linker to a third party, and thus, the
CodeWarrior toolset can't generate universal binaries.
For a limited time, Apple offered a Developer Transition Kit that
contained the Xcode tools and universal SDK. For actual testing on the
target platform, the kit also had a preliminary x86 hardware platform
with a 3.6-GHz Pentium 4 processor, running a preview release of Mac OS
X 4.1 for Intel.
Code Casualties
Apple has laid a solid foundation for making the migration possible.
However, any programmer who's done a code port regards this plan with a
healthy skepticism, because the infrastructure is still preliminary in
some areas. More important, not all applications will be easy to port,
and some applications will be left behind, due to design issues and
costs. Let's see if we can't draw up a triage list of which
applications are most likely to survive the transition.
First and foremost, any application ported to the x86 Mac platform
must be a Mac OS X application. Fortunately, Mac OS X provides a wealth
of different APIs for writing and migrating applications—there's
Carbon, Cocoa, Java, and BSD UNIX. Table 1 provides a brief summary of the APIs that Mac OS X offers.
To start the triage list, it should be obvious that if you're
writing a kernel extension, driver, or other low-level system service
that requires intimate knowledge of the kernel plumbing or processor
architecture, you've got a code rewrite ahead of you, no matter what
API you use.
Mac OS 8/9 applications won't survive the transition unless they're
ported to the Carbon API. Furthermore, existing Carbon apps that use
the PowerPC-based Preferred Executable Format (PEF) will have to be
rebuilt with Xcode 2.1 for conversion to the Mach-O executable format.
The reason is that Mac OS X uses the dyld runtime, which is the native
execution environment for both PowerPC and Intel Mac platforms. The
dyld runtime uses the Mach-O format for its executable files, and as
we've already learned, universal binaries rely on the Mach-O format to
package PowerPC and x86 binary images.
Applications that use common system services should port easily.
However, caveats abound. For example, how the two processors store data
in memory can cause all sorts of problems even for simple applications.
Architecture's Impact
High-level application frameworks hide the gritty hardware details
from developers to improve code portability and stability. When the
platform's processor changes, fundamental differences in hardware
behavior can ripple up through these frameworks and hurt you. Let's
take a look at two of these differences and see how they affect porting
a PowerPC application to the x86 platform.
One such problem is known as the "Endian issue" and occurs because
of how the PowerPC and Intel processors arrange the byte order of data
in memory. The PowerPC processor is Big endian in that it stores a data
value's MSB in the lowest byte address, while the Intel is Little
endian because it places the value's LSB in the lowest byte address.
Normally, the Endian issue doesn't rear its ugly head unless you access
multibyte variables through overlapping unions, use a constant as an
offset into data structure, or use bitfields larger than a byte. In
these situations, the position of bytes within the variable matter.
Although the program's code executes flawlessly, the retrieved data is
garbled due to where the processor placed the bytes in memory, and
spurious results occur. To fix this problem, reference data in
structures using field names and not offsets, and be prepared to swap
bytes if necessary.
The Endian issue manifests itself another way when an application
accesses data piecewise as bytes and then reassembles it into larger
data variables or structures. This occurs when an application reads
data from a file or through a network. The bad news is that any
application that performs disk I/O on binary data (such as 16-bit audio
stream), or uses network communications (such as e-mail or web
browser), can be plagued by this problem. The good news is that each
Mac OS X API provides a slew of methods that can perform the byte
swapping for you. Consult the Universal Binary Programming Guidelines (http://developer.apple.com/documentation/MacOSX/Conceptual/ universal_binary/universal_binary.pdf) from Apple for details.
Another potential trap manifested by the Endian issue is if your Mac
application uses custom resources. Mac OS X understands the structure
of its own resources and will automatically perform any byte-swapping
if required. However, for a custom resource whose contents are unknown
to the OS, you will have to write a byte-swapping routine for it. Those
applications that use CodeWarrior's PowerPlant framework require a
byte-swapping routine to swap the custom PPob resources that this framework uses. Appendix E in the Universal Binary Programming Guidelines document has some example code that shows how to swap PPob resources, and this code serves as a guideline on how to write other byte-swapping routines.
It's In the Vector
Another major processor architectural issue is for those
applications that make heavy use of the PowerPC's AltiVec instructions
for scientific computing and video editing—some of the Mac's
bread-and-butter applications. (AltiVec is a floating-point and integer
SIMD instruction set referred to as "AltiVec" by Motorola, "VMX" by
IBM, and "Velocity Engine" by Apple; see
http://developer.apple.com/hardware/ ve/.) AltiVec consists of more
than 160 special-purpose instructions that operate on vectors held in
32 128-bit data registers. A 128-bit vector may be composed of multiple
elements, such as four 32-bit integers, eight 16-bit integers, or four
32-bit floats. The AltiVec instructions can perform a variety of Single
Instruction Multiple Data (SIMD) arithmetic operations (multiply, multiply-add, and others) on these elements in parallel, yielding high-throughput data processing.
Applications relying on AltiVec instructions must be rewritten to
use Intel's SIMD instructions, either its Multimedia extensions (MMX)
instructions or its Streaming SIMD Extensions (SSE) instructions. There
are several flavors of SSE instructions (SSE, SSE2, and SSE3) and they
work on eight 128-bit data registers.
The ideal solution for this problem is to use Cocoa's Accelerate
Framework. It implements vector operations transparent of the
underlying hardware. An application that uses the Accelerate Framework
can operate without modification on both Mac platforms. This framework
provides a ready-made set of optimized algorithms for image processing,
along with DSP algorithms for video processing, medical imaging, and
other data-manipulation functions.
If you must port your AltiVec code to the x86 SSE instructions, on
the plus side, Intel provides a high-level C interface that simplifies
the use of these instructions. Another major plus is that you've
already "vectorized" your high-level code for use with AltiVec, and
these modifications apply to using SSE instructions as well. That is,
you should have unrolled code loops and modified the data structures
they reference to take advantage of the SIMD instruction's parallel
processing capabilities.
The big negative to porting to SSE is that the rest of your code
will need to be heavily revised due to the differences between the
AltiVec and SSE instructions. For example, there's no direct
correspondence in behavior between the AltiVec and x86 permute instructions. The magnitude of the shift performed by the AltiVec permute operation can be changed at runtime, while the x86 permute requires the magnitude be set at compile time. This makes it difficult for the x86 permute
to clean up misaligned data accesses, especially for use with the SSE
instructions themselves. In general, AltiVec instructions that execute
on the vector complex integer unit (such as the multiply-accumulate
operations) have no direct counterparts in the SSE instruction set, and
these portions of the vector code will need the most work.
Returning to the triage list, even applications written in Cocoa and
Carbon aren't immune to certain processor issues. Applications that do
any file or network I/O will have to be examined and modified, due to
the Endian issue. Even mundane applications that make use of special
data structure will need to be checked carefully. Those applications
that make use of AltiVec will have to be completely rewritten, either
to the Accelerate Framework or to SSE3 instructions. Whether they
survive the transition depends on how much it will cost to correct
these issues.
A Real-World Example
How well will this transition go? Some early developer reports
pegged the initial porting process at taking anywhere from 15 minutes
to 24 hours, depending upon how well Apple's guidelines were followed
when the application was written. Those developers whose applications
were written in Cocoa usually experienced the least difficulty, which
shouldn't come as a surprise because Cocoa was engineered from the
ground up as an application framework for NeXTSTEP, which became Mac OS
X.
Bare Bones Software's port of BBEdit, its industrial-strength
programming editor, offers an interesting glimpse of the process
(http://www.barebones.com/products/ bbedit/). Portions of BBEdit were
written in C++ and use the Carbon API, while others portions were
written in Objective-C and use the Cocoa API. It only took 24 hours to
get BBEdit running on the Mac x86 platform. It helped that the files
BBEdit works with—ASCII text files that consist of byte data—were
Endian neutral.
However, BBEdit's developers emphasize that although they got the
program running quickly, getting every feature to work reliably took
another several weeks of work, especially for testing to ensure that
the features worked reliably in the new environment. Still, considering
that the application was executing on a completely different platform
in a short amount of time and without requiring a total code rewrite,
this augers well for many Mac OS X applications making the transition.
In the end, time and developers will show us how well Apple managed the
transition to the Intel x86 processor.
DDJ
Listing One
#define FAT_MAGIC 0xcafebabe
#define FAT_CIGAM 0xbebafeca /* NXSwapLong(FAT_MAGIC) */
struct fat_header {
uint32_t magic; /* FAT_MAGIC */
uint32_t nfat_arch; /* number of structs that follow */
};
struct fat_arch {
cpu_type_t cputype; /* cpu specifier (int) */
cpu_subtype_t cpusubtype; /* machine specifier (int) */
uint32_t offset; /* file offset to this object file */
uint32_t size; /* size of this object file */
uint32_t align; /* alignment as a power of 2 */
};
Back to article
José Antonio Álvarez Bermejo.
- - - disclaimer - - -
Las opiniones son como el olor corporal, a cada uno le gusta el suyo. No te esfuerces en oponerte a mi opinión por que esa es tú opinión y a mi me trae sin cuidado.