


CGO 2015: San Francisco, CA, USA
- Kunle Olukotun, Aaron Smith, Robert Hundt, Jason Mars:
Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2015, San Francisco, CA, USA, February 07 - 11, 2015. IEEE Computer Society 2015, ISBN 978-1-4799-8161-8
GPU optimization
- Qing Jiao, Mian Lu, Huynh Phung Huynh, Tulika Mitra:
Improving GPGPU energy-efficiency through concurrent kernel execution and DVFS. 1-11
- Naznin Fauzia, Louis-Noël Pouchet, P. Sadayappan:
Characterizing and enhancing global memory data coalescing on GPUs. 12-22
- Chao Li, Yi Yang, Zhen Lin, Huiyang Zhou:
Automatic data placement into GPU on-chip memory resources. 23-33
Tools and debugging
- Kyle Dewey, Vineeth Kashyap, Ben Hardekopf:
A parallel abstract interpreter for JavaScript. 34-45
- Evgeniy Stepanov, Konstantin Serebryany:
MemorySanitizer: fast detector of uninitialized memory use in C++. 46-55
- Long Zheng, Xiaofei Liao, Bingsheng He, Song Wu, Hai Jin:
On performance debugging of unnecessary lock contentions on multicore processors: a replay-based approach. 56-67
Runtime optimization and techniques
- Byron Hawkins, Brian Demsky, Derek Bruening, Qin Zhao:
Optimizing binary translation of dynamically generated code. 68-78
- William Arthur, Ben Mehne, Reetuparna Das, Todd M. Austin:
Getting in control of your control flow with control-data isolation. 79-90
- Jithendra Srinivas, Wei Ding, Mahmut T. Kandemir:
Reactive tiling. 91-102
Microarchitecture
- Erven Rohou, Bharath Narasimha Swamy, André Seznec:
Branch prediction and the performance of interpreters: don't trust folklore. 103-114
- James Pallister, Kerstin Eder, Simon J. Hollis:
Optimizing the flash-RAM energy trade-off in deeply embedded systems. 115-124
- Lawrence C. McAfee, Kunle Olukotun:
EMEURO: a framework for generating multi-purpose accelerators via deep learning. 125-135
Parallelism and concurrency
- Wai Teng Tang, Ruizhe Zhao, Mian Lu, Yun Liang, Huynh Phung Huynh, Xibai Li, Rick Siow Mong Goh:
Optimizing and auto-tuning scale-free sparse matrix-vector multiplication on Intel Xeon Phi. 136-145
- Brandon Lucia, Luis Ceze:
Data provenance tracking for concurrent programs. 146-156
- Sunil Shrestha, Guang R. Gao, Joseph B. Manzano, Andrés Márquez, John Feo:
Locality aware concurrent start for stencil applications. 157-166
Code generation and optimization
- Niranjan Hasabnis, Rui Qiao, R. Sekar:
Checking correctness of code generator architecture specifications. 167-178
- JinSeok Oh, Soo-Mook Moon:
Snapshot-based loading-time acceleration for web applications. 179-189
Static program analysis and optimization
- Vasileios Porpodas, Alberto Magni, Timothy M. Jones:
PSLP: padded SLP automatic vectorization. 190-201
- Roland Leißa, Marcel Köster, Sebastian Hack:
A graph-based higher-order intermediate representation. 202-212
- Cosmin E. Oancea, Lawrence Rauchwerger:
Scalable conditional induction variables (CIV) analysis. 213-224
Best paper session
- Vaivaswatha Nagaraj, R. Govindarajan:
Approximating flow-sensitive pointer analysis using frequent itemset mining. 225-234
- Simone Campanoni, Glenn H. Holloway, Gu-Yeon Wei, David M. Brooks:
HELIX-UP: relaxing program semantics to unleash parallelization. 235-245
- Xiaochun Zhang, Qi Guo, Yunji Chen, Tianshi Chen, Weiwu Hu:
HERMES: a fast cross-ISA binary translator with post-optimization. 246-256
- Hee-Seok Kim, Izzat El Hajj, John A. Stratton, Steven S. Lumetta, Wen-mei W. Hwu:
Locality-centric thread scheduling for bulk-synchronous programming models on CPU architectures. 257-268
