\documentclass [draft,notitlepage] {article}
\usepackage[margin=1.0in,nohead,nofoot]{geometry}
\usepackage{url}
\usepackage{hyperref}
\pagestyle{empty}
\newcommand\myurl[2]{\url{#1}}
\newcommand\email[1]{\href{mailto:#1}{\nolinkurl{#1}}}
\title{CMS Software and Computing Performance Paper (Title TBD)}
\author{CMS Collaboration - Editors: K.~Bloom and P.~Elmer}
\begin{document}
\maketitle
\section{Introduction}
\subsection {Physics goals}
\begin{itemize}
\item describe quantitatively the scale required to reach physics goals
\item Higgs~\cite{CMSHIGGS} as a possible example physics problem: should (briefly!) run through everything S\&C needs to do for the physics analysis. (At the very least, Higgs provides an example of quick physics turnaround: when was the last data taken before the 4 July 2012 Higgs announcement?) Is there another physics problem with contrasting requirements? Another physics requirement: results must get out fast due to competition and great community interest!
\end{itemize}
\subsection{The Large Hadron Collider and the CMS detector}
\begin{itemize}
\item Scale and requirements of the LHC
\item Scale and requirements of the CMS detector
\item Event size, pileup (e.g.\ effect on multiplicities)
\end{itemize}
\subsection{Software and Computing System}
\begin{itemize}
\item Describe the major limitations and challenges in building the S\&C system
\item requirements of distributed development and distributed computing facilities:
\item These include: a large group of code developers who are ultimately novice coders, wide geographical distribution of developers, highly distributed computing resources, a newfangled grid system that had to be shaken down, and very distributed and ultimately novice-coder analysts! (I think we should be covering analysis in here too.) What else do we need to add to this list?
\end{itemize}
Goals of software and computing:
\begin{itemize}
\item Software development model must incorporate the work of many geographically distributed coders
\item Software must run on all necessary architectures, must be robust against potentially fragile computing facilities
\item Software needs to be able to run at a scale needed to turn around results quickly
\item Computing must run the software at sufficient scale
\item Computing must make the best use of all available resources
\item Computing must make the data and computing resources available to all analysts
\item In general, software and computing should never limit the rate of production of physics results and papers (``factory'' latency). Do we have a way of demonstrating that this actually happened?
\end{itemize}
\section{Software Applications}
Describe what we are trying to do with the computing:
\begin{itemize}
\item Trigger (how much of this is ``us''? HLT?)
% CHEP 2013 "The CMS High Level Trigger" D. Trocino
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=1085&pid=7050
% CHEP12 "The CMS High Level Trigger System: Experience and Future
% Development" A. Spataru
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4520
\item Reconstruction
% CHEP13 "The Role of Effective Event Reconstruction in the Higgs Boson Discovery at CMS", S. Krutelyov
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=1085&pid=7258
\item Analysis
\item MC simulation
% CHEP13 - there is both a fastsim talk and a fullsim talk, but I think
% we combined these. I'm confused by what we have in Cinco, will double
% check on CHEP13 site (I'm one of the track coordinators for that CHEP track)
\item Calibration/alignment
% CHEP13 "Alignment and calibration of CMS detector during collisions at LHC" R.Castello
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=1085&pid=7170
\item I also considered including things like data management here, but strictly speaking that isn't needed to get a physics result, which I think is what this section is about. Is anything missing from the list?
\end{itemize}
\section{Software Implementation}
\begin{itemize}
\item Software development model
\item Framework architecture
% Perhaps some of the text can come from the paper about the new
% framework evolution:
% CHEP13 "Stitched Together: Transitioning CMS to a Hierarchical Threaded Framework" C. Jones
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=1085&pid=7110
% or perhaps there are older CHEP papers.
\item Software architecture and evolution
\item ``Performance'' numbers (code base size, number of developers, etc.)
over time, major releases
% CHEP10 "The CMS Reconstruction Software" D. Lange
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2346
\item Software and Release validation
% CHEP13 "The Rise of the Build Infrastructure" G.Eulisse
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/conf_display.aspx?cid=1085
% CHEP 10 "Release Strategies: CMS approach for Development and Quality
% Assurance", E. Sexton-Kennedy
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2344
\item Evolution with architectures (32bit/64bit, compilers)
% Various ACAT/CHEP presentations from P.Elmer, M.Kortelainen
\item CPU and I/O optimization?
% Various ACAT/CHEP presentations from P.Elmer, M.Kortelainen
\item Documentation?
% Surely we have documentation?
% CHEP12 "Developing CMS software documentation system" M. Stankevicius
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4506
\end{itemize}
\section{Computing Implementation}
(Not sure if the ordering is quite right.)
\begin{itemize}
\item Distributed Model (details in practice including tier system, numbers)
% CHEP12 "Trying to Predict the Future - Resource Planning and Allocation
% in CMS" P. Kreuzer
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4582
% CHEP10 "Experience with the CMS Computing Model from commissioning to
% collisions" D. Bonacorsi
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2306
% CHEP10 "Monitoring the Readiness and Utilization of the Distributed CMS
% Computing Facilities during the first year LHC running" J. Hernandez
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2332
% Note that we don't currently have monitoring in here as an explicit
% topic. But it probably needs to be discussed in operations, at the very least.
% CHEP10 "CMS Distributed Computing Integration in the LHC sustained
% operations era" C. Grandi
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2264
\item Calibration/alignment procedure (prompt calibration loop,
infrastructure, subsequent access\ldots)
% CHEP10 "Alignment & calibration experience under LHC data-taking
% conditions in the CMS experiment" R. Mankel
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2324
% CHEP12 "Handling of time-critical Conditions Data in the CMS experiment -
% Experience of the first year of data taking" G. Govi
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4571
% CHEP12 "Comparison of the Frontier Distributed Database Caching System with NoSQL Databases" D. Dykstra
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4418
% CHEP12 "CMS experience with online and offline Databases" A. Pfeiffer
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4444
% CHEP12 "Operational Experience with the Frontier System in CMS"
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4540
% CHEP10 "Time-critical database condition data handling in the CMS
% experiment during the first data taking period" S. Di Guida
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2300
% CHEP10 "CMS Online Database experience with first data" M. Janulis
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2310
\item Operations Model (does this include site operations?)
% CHEP13 "CMS Computing Operations During Run1" O.Gutsche
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=1085&pid=7055
% CHEP12 "Towards a global monitoring system for CMS computing operations"
% A. Sciaba
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4445
% CHEP12 "CMS Tier-0: Preparing for the future" D. Hufnagel
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4424
% CHEP12 "Towards higher reliability of CMS Computing Facilities" J. Flix
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4539
% CHEP12 "PREP: Production and Reprocessing management tool for CMS"
% F. Cossutti
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4566
% Note -- now we're wandering towards production operations -- we do need a
% way to cover that in the paper, I think. Is that part of "operations
% model"? Can we leverage the IEEE paper, should it exist?
% CHEP10 "CMS Distributed Computing Workflow Experience" J. Haas
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2296
% CHEP10 "Deployment of the CMS software on the WLCG Grid" W. Behrenhoff
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2278
% CHEP10 "The architecture and operation of the CMS Tier-0" D. Hufnagel
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2315
% CHEP10 "24/7 Monitoring of the CMS Computing infrastructure and
% facilities" P. Kreuzer
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2330
\item Data transfers/network usage
% CHEP13 "The CMS Data Management System" N. Magini
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=1085&pid=7057
% CHEP12 "CMS Data Transfer operations after the first years of LHC
% collisions" R. Kaselis
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4537
% CHEP12 "Performance studies and improvements of CMS Distributed Data
% Transfers", J. Flix
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4536
% CHEP10 "Large Scale Commissioning and Operational Experience with Tier-2
% to Tier-2 Data Transfer Links in CMS" J. Letts
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2265
% CHEP10 "Improving CMS data transfers among its distributed Computing
% Facilities" N. Magini
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2323
\item Data quality monitoring (or is this a software topic?)
\item Data Management (design choices, implementation and performance)
% CHEP12 "From toolkit to framework - the past and future evolution of
% PhEDEx" T. Wildish
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4552
% CHEP10 "Data Aggregation System, an information retrieval on demand over
% relational and non-relational distributed data sources." V. Kuznetsov
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2317
\item Workflow Management (design choices, implementation and performance)
% CHEP12 "The CMS workload management system" S. Wakefield
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4454
% CHEP12 "A new era for central processing and production in CMS"
% E. Fajardo Hernandez
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4402
\item User Analysis (+ Support??)
% CHEP12 "CMS Analysis Deconstructed" S. Malik
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4399
% CHEP12 "Maintaining and improving of the training program on the analysis
% software in CMS" S. Malik
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4570
% CHEP10 "Perspective of User Support for the CMS Collaboration" S. Malik
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2263
% CHEP10 "A tour of the CMS Physics Analysis Model" B. Hegner
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2309
% CHEP10 "CMS distributed analysis infrastructure and operations:
% experience with the first LHC data" E. Vaandering
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2338
% CHEP10 "Design and early experience with promoting user-created data in
% CMS" M. Giffels
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2293
\item ``Performance'' numbers (resource usage, data sizes, analysis participation, turnaround times if that can be captured) [Or should this be integrated with other sections, somehow? Editorial choice to consider.]
% CHEP12 "CMS resource utilization and limitations on the grid after the
% first two years of LHC collisions" K. Bloom
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4542
% CHEP10 "Measuring and Understanding Computing Resource Utilization in
% CMS" J. Letts
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2268
\end{itemize}
\section{Anticipated Evolution for Run 2}
\begin{itemize}
\item Multithreaded framework
%CHEP12 "Study of a Fine Grained Threaded Framework Design" C. Jones
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4396
%CHEP12 "Multi-core processing and scheduling performance in CMS"
%J. Hernandez
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4365
%CHEP10 "Multicore-aware applications in CMS" C. Jones
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=462&pid=2342
\item Software engineering efforts
%CHEP12 "Development and Evaluation of Vectorised and Multi-Core Event
%Reconstruction Algorithms within the CMS Software Framework" D. Piparo
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4546
\item Evolution of tiered computing model
% CHEP12 "Evolution of the Distributed Computing Model of the CMS
% experiment at the LHC" C. Grandi
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4347
\item Use of data federations, changes in data distribution
%CHEP12 "Implementing data placement strategies for the CMS experiment
%based on a popularity model" D. Giordano
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4443
\item Efforts on opportunistic resources
% CHEP12 "Controlled overflowing of data-intensive jobs from oversubscribed
% sites" I. Sfiligoi
% https://cms-mgt-conferences.web.cern.ch/cms-mgt-conferences/conferences/pres_display.aspx?cid=665&pid=4419
\end{itemize}
\section{Conclusion}
It would be good to get back to the physics: discuss cases where we got results out quickly (Higgs), turned around new samples quickly, or got new releases or calibrations out fast. These are ultimately the measures of our success! Put another way, this is where we should clearly emphasize that we've met some of the goals described earlier.
\newpage
\bibliographystyle{unsrt}
\bibliography{references}
\end{document}