Presentations and talks:
- Building
an Apache Spark Performance Lab: Tools and Techniques for
Optimization, CERN, April 2024. [pptx | PDF | sparkMeasure demo | TPCDS-PySpark demo | Spark-Dashboard demo]
- Introduction
to Apache Spark APIs for Data Processing: Training course on Apache
Spark, November 2022. PDFs and videos (and notebooks) are available.
- Basic
Physics Analyses Implemented Using Apache Spark, PyHEP 2022, September
14th, 2022, pptx,
PDF,
PDF_extended_version,
Video
- Monitor
Apache Spark 3 on Kubernetes using Metrics and Plugins, Data+AI Summit
2021, May 26th, 2021, pptx,
PDF,
demo (mp4)
- What
is New with Apache Spark Performance Monitoring in Spark 3.0, Data+AI
Summit Europe 2020, November 18th, 2020, pptx,
PDF
- Big Data Tools and Pipelines
for Machine Learning in HEP, CERN EP-IT Data science seminar, December
4th, 2019, pptx,
PDF
- Performance
Troubleshooting Using Apache Spark Metrics, Spark Summit Europe 2019,
Amsterdam, October 17th, 2019, pptx,
PDF
- Deep
Learning Pipelines for High Energy Physics using Apache Spark with
Distributed Keras on Analytics Zoo, Spark Summit Europe 2019,
Amsterdam, October 16th, 2019, pptx,
PDF
- Big Data
In HEP - Physics Data Analysis, Machine learning and Data Reduction at
Scale with Apache Spark, IXPUG Annual Conference 2019, CERN, September
24th, 2019, pptx,
PDF
- Apache
Spark for RDBMS Practitioners, Spark Summit Europe 2018, London,
October 4th, 2018, pptx,
PDF,
Video
- Data
Analytics – Use Cases, Platforms, Services @ CERN IT, ITMM Meeting, CERN,
March 5th, 2018, pptx, PDF
- Apache
Spark Performance Troubleshooting at Scale, Challenges, Tools, and Methods,
Spark Summit Europe 2017, Dublin, October 26th, 2017, pptx,
PDF,
Video
- Overview
of Big Data Solutions and Services at CERN, CERN Knowledge Transfer Forum,
CERN, September 29th, 2017, slides: pptx, PDF
- Hadoop
and Spark Ecosystem for Data Analytics, Experience and Outlook, WLCG GDB
meeting, CERN, September 13th, 2017, slides: pptx,
PDF
- Data
Analytics and CERN IT Hadoop Service, CERN
openlab Technical Workshop, CERN, December 9th, 2016, slides: pptx, PDF
- Apache
Spark 2.0 Performance Improvements Investigated With Flame Graphs, Spark Summit Europe,
Brussels, October 26th, 2016, slides: pptx, PDF, Video
- Integration
of Oracle and Hadoop: hybrid databases affordable at scale, CHEP 2016, San Francisco, October 11th,
2016, slides: pptx,
PDF
- Stack
Traces and Flame Graphs for Oracle Troubleshooting, UKOUG
Tech15 Super Sunday, Birmingham, December 6th, 2015, slides: pptx, PDF
- Modern
Linux Tools for Oracle Troubleshooting, Swiss Oracle User Group (SOUG) event, Prangins (CH), May 21st, 2015, PDF
- Database
Services During Run 2, WLCG Collaboration Workshop, Okinawa (JP),
April 11th, 2015, slides
- Modern
Linux Tools for Oracle Troubleshooting, UKOUG Tech14, Liverpool,
December 9th, 2014, slides, PDF
- A
Closer Look at CALIBRATE_IO, UKOUG Tech14, Liverpool, December 9th,
2014, slides, PDF
- Introduction
on Data for Physics at CERN and Deep Dive into Oracle ASM, Enkitec E4 2014, Dallas
(TX), June 2014, slides
- A Latency
Picture is Worth a Thousand Storage Metrics, Hotsos 2014, Dallas (TX),
March 4th, 2014, slides
- Lost
Writes, a DBA's Nightmare?, UKOUG Tech13, Manchester, December 4th,
2013, slides
- Storage
Latency for Oracle DBAs, UKOUG Tech13, Manchester, December 2nd, 2013,
slides
- Active
Data Guard at CERN, UKOUG Conference 2012, Birmingham, December 4th,
2012, slides
- Testing
Storage for Oracle RAC 11g with NAS, ASM, and SSD Flash Cache, UKOUG
Conference 2011, Birmingham, December 6th, 2011, slides
- CERN
IT-DB Deployment, Status, Outlook, ESA-GAIA DB Workshop, ISDC,
Geneva, March 2011, slides
- ACFS
Under Scrutiny, UKOUG Conference 2010, Birmingham, Nov 2010, slides
- Data
Lifecycle Management Challenges and Techniques, User’s Experience, UKOUG
Conference 2010, Birmingham, Nov 2010, slides
- CERN
DB Services for Physics - 2010 Report, Distributed Database Workshop,
CERN, Nov 2010, slides
- Overview
of the CERN DB Services for Physics, Orcan Swedish Oracle Users group
Conference, Stockholm, May 2010, slides
- Compressing
Very Large Data Sets in Oracle, UKOUG Conference 2009, Birmingham, Dec
2009, slides
- Evaluating
and Testing Storage Performance for Oracle DBs, UKOUG Conference 2009,
Birmingham, Dec 2009, slides
- ASM configuration review,
Distributed Database Operations Workshop, CERN, November 2009.
- Storage for data management,
'after C5' CERN-IT presentation, CERN, May 2009.
- Data Lifecycle
Review and Outlook, Distributed Database Operations Workshop,
Barcelona, April 2009
- Database Operations
Security Overview, Distributed Database Operations Workshop,
Barcelona, April 2009.
- Implementing
ASM Without HW RAID, User’s Experience, UKOUG Conference 2008,
Birmingham, Dec 2008, slides
- Datalifecycle_WLCG_DB_workshop_LC:
Data lifecycle management ideas, WLCG Database Workshop, CERN, Nov 2008.
- Workhops_8_Jul_case_PVSS_LC:
Oracle performance tuning case study, PVSS archiver, Database Developers
Workshop, CERN, July 8th, 2008.
- Workhops_8_Jul_perf_for_developers_LC:
Oracle performance tuning ideas for developers, Database Developers
Workshop, CERN, July 8th, 2008.
- Oracle Storage
Performance Studies: Scalability tests of VLDBs with RAC and ASM
for Physics DB Services, WLCG Workshop, CERN, Apr 2008.
- A
Closer Look Inside Oracle ASM, UKOUG 2007, Birmingham, December 2007, slides
- WLCG_Oracle_perf_for_admin:
Oracle Performance for Administrators, WLCG Reliability Workshop, CERN,
November 2007.
- Oracle_CERN_Service_Architecture, CERN DB workshop,
January 2007.
- Database
Services at CERN with Oracle 10g RAC and ASM on Commodity HW, UKOUG RAC
SIG meeting, London, October 2006, slides
- T2_tutorials_Jun06_OracleRAC:
Oracle and RAC for Physics DB, June 2006
- DB services at CERN HEPIX 2006:
"Database Services for Physics at CERN with Oracle 10g RAC",
HEPIX Conference, Rome, April 2006.
- DB_Serv_Meeting_ASM_Perf:
"ASM-based storage to scale out Database Services for Physics",
April 2006
- 3D_RAL_Mar_06: "Oracle 10gR2
configuration", 3D DB Workshop at RAL (UK), March 2006.