Thursday, 03 June 2010
Intel® 64 and IA-32 Architectures Optimization Reference Manual
Intel 64 processors. The coding rules and code optimization techniques listed target the Intel Core microarchitecture, the Intel NetBurst microarchitecture and the Pentium M processor microarchitecture. In most cases, coding rules apply to software running in 64-bit mode of Intel 64 architecture, compatibility mode of Intel 64 architecture, and IA-32 modes (IA-32 modes are supported in IA-32 and Intel 64 architectures). Coding rules specific to 64-bit modes are noted separately.
TUNING YOUR APPLICATION
Tuning an application for high performance on any Intel 64 or IA-32 processor
requires understanding and basic skills in:
• Intel 64 and IA-32 architecture
• C and Assembly language
• hot-spot regions in the application that have impact on performance
• optimization capabilities of the compiler
• techniques used to evaluate application performance
The Intel® VTune™ Performance Analyzer can help you analyze and locate hot-spot regions in your applications. On the Intel® Core™ i7, Intel® Core™2 Duo, Intel® Core™ Duo, Intel® Core™ Solo, Pentium® 4, Intel® Xeon® and Pentium® M processors, this tool can monitor an application through a selection of performance monitoring events and analyze the performance event data that is gathered during code execution. This manual also describes information that can be gathered using the performance counters through Pentium 4 processor’s performance monitoring events.
ABOUT THIS MANUAL
The Intel® Xeon® processor 3000, 3200, 5100, 5300, 7200 and 7300 series, Intel® Pentium® dual-core, Intel® Core™2 Duo, Intel® Core™2 Quad, and Intel® Core™2 Extreme processors are based on Intel® Core™ microarchitecture. In this document, references to the Core 2 Duo processor refer to processors based on the Intel® Core™ microarchitecture.
The Intel® Xeon® processor 3100, 3300, 5200, 5400, 7400 series, Intel® Core™2 Quad processor Q8000 series, and Intel® Core™2 Extreme processors QX9000 series are based on 45 nm Enhanced Intel® Core™ microarchitecture.
The Intel® Core™ i7 processor and Intel® Xeon® processor 5500 are based on 45 nm Intel® Microarchitecture (Nehalem).
In this document, references to the Pentium 4 processor refer to processors based on the Intel NetBurst® microarchitecture. This includes the Intel Pentium 4 processor and many Intel Xeon processors based on Intel NetBurst microarchitecture. Where appropriate, differences are noted (for example, some Intel Xeon processors have third level cache).
The Dual-core Intel® Xeon® processor LV is based on the same architecture as Intel® Core™ Duo and Intel® Core™ Solo processors. Intel® Atom™ processor is based on Intel® Atom™ microarchitecture.
The following bullets summarize chapters in this manual.
• Chapter 1: Introduction — Defines the purpose and outlines the contents of this manual.
• Chapter 2: Intel® 64 and IA-32 Processor Architectures — Describes the microarchitecture of recent IA-32 and Intel 64 processor families, and other
features relevant to software optimization.
• Chapter 3: General Optimization Guidelines — Describes general code
development and optimization techniques that apply to all applications designed
to take advantage of the common features of the Intel Core microarchitecture,
Enhanced Intel Core microarchitecture, Intel NetBurst microarchitecture and
Pentium M processor microarchitecture.
• Chapter 4: Coding for SIMD Architectures — Describes techniques and
concepts for using the SIMD integer and SIMD floating-point instructions
provided by the MMX™ technology, Streaming SIMD Extensions, Streaming
SIMD Extensions 2, Streaming SIMD Extensions 3, SSSE3, and SSE4.1.
• Chapter 5: Optimizing for SIMD Integer Applications — Provides optimization
suggestions and common building blocks for applications that use the 128-
bit SIMD integer instructions.
• Chapter 6: Optimizing for SIMD Floating-point Applications — Provides
optimization suggestions and common building blocks for applications that use
the single-precision and double-precision SIMD floating-point instructions.
• Chapter 7: Optimizing Cache Usage — Describes how to use the PREFETCH
instruction, cache control management instructions to optimize cache usage, and
the deterministic cache parameters.
• Chapter 8: Multiprocessor and Hyper-Threading Technology — Describes
guidelines and techniques for optimizing multithreaded applications to achieve
optimal performance scaling. Use these when targeting multicore processors,
processors supporting Hyper-Threading Technology, or multiprocessor (MP)
systems.
• Chapter 9: 64-Bit Mode Coding Guidelines — This chapter describes a set of
additional coding guidelines for application software written to run in 64-bit
mode.
• Chapter 10: SSE4.2 and SIMD Programming for Text-
Processing/Lexing/Parsing— Describes SIMD techniques of using SSE4.2
along with other instruction extensions to improve text/string processing and
lexing/parsing applications.
• Chapter 11: Power Optimization for Mobile Usages — This chapter provides
background on power saving techniques in mobile processors and makes recommendations
that developers can leverage to provide longer battery life.
• Chapter 12: Intel® Atom™ Processor Architecture and Optimization —
Describes the microarchitecture of processor families based on Intel Atom
microarchitecture, and software optimization techniques targeting Intel Atom
microarchitecture.
• Appendix A: Application Performance Tools — Introduces tools for analyzing
and enhancing application performance without having to write assembly code.
• Appendix B: Intel® Pentium® 4 Processor Performance Metrics —
Provides information that can be gathered using Pentium 4 processor’s
performance monitoring events. These performance metrics can help
programmers determine how effectively an application is using the features of
the Intel NetBurst microarchitecture.
• Appendix C: IA-32 Instruction Latency and Throughput — Provides latency
and throughput data for the IA-32 instructions. Instruction timing data specific to
recent processor families are provided.
• Appendix D: Stack Alignment — Describes stack alignment conventions and
techniques to optimize performance of accessing stack-based data.
• Appendix E: Summary of Rules and Suggestions — Summarizes the rules and tuning suggestions referenced in the manual.
RELATED INFORMATION
For more information on the Intel® architecture, techniques, and the processor
architecture terminology, the following are of particular interest:
• Intel® 64 and IA-32 Architectures Software Developer’s Manual (in five volumes)
• Intel® Processor Identification with the CPUID Instruction, AP-485
• Developing Multi-threaded Applications: A Platform Consistent Approach
• Intel® C++ Compiler documentation and online help
• Intel® Fortran Compiler documentation and online help
• Intel® VTune™ Performance Analyzer documentation and online help
• Using Spin-Loops on Intel Pentium 4 Processor and Intel Xeon Processor MP
Saturday, 20 March 2010
CONCEPTS IN COMPUTER DEVELOPMENT: ENIAC AND VON NEUMANN
ENIAC, the World's First Computer
ENIAC, short for Electronic Numerical Integrator And Computer, was the first digital electronic device to operate as a computer. It was completed for the United States Army in 1945 and announced to the public in 1946. At the time, the machine was intended to compute artillery firing tables, that is, the trajectory and range of shells, during World War II.
The motivation for building ENIAC was the need for a tool that would help a country at war. Like other innovations, ENIAC was built on three concepts and technologies already available at the time: the "mechanical brain", the vacuum tube, and the punch card (a card with holes punched at specific positions to store information). These three technologies were brought together by Professor John Mauchly, a physics lecturer from
ENIAC weighed about 30 tons, drew about 200 kilowatts of electrical power, and consisted of roughly 19,000 vacuum tubes and 1,500 relays, along with a very large number of resistors, capacitors, and inductors.
Besides its military role, ENIAC could be used for weather prediction, atomic energy calculations, cosmic ray studies, temperature measurement, random number research, wind tunnel design, and other scientific work. ENIAC, which became the basis for today's computers, could add, subtract, multiply, and divide, and could store up to 20 decimal numbers of 10 digits each. The units that performed the arithmetic also served as the storage units. Nicknamed the "Giant Brain", the computer could calculate about a thousand times faster than the electromechanical calculating machines of its day.
In operation, ENIAC proved quite difficult to maintain. For example, when one of its vacuum tubes failed, technicians had to check all 19,000 or so tubes to find the one that had stopped working.
Eventually, as the need for faster and more efficient calculating machines grew more pressing, ENIAC was retired on 2 October 1955. Today, four of ENIAC's forty panels are kept at
THE VON NEUMANN COMPUTER CONCEPT
John von Neumann (1903-1957) was the scientist who laid the foundations of the modern computer. In his short life, von Neumann became one of the great scientists of the 20th century. He contributed to mathematics, quantum theory, game theory, nuclear physics, and computer science. He was also one of the most influential scientists in the development of the atomic bomb in