<!DOCTYPE html>
<html lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<title> 6 Tutorial: Text manipulation with stringr | Automated procedures for analyzing business communication strategies: Focusing on the board game industry</title>
<meta name="description" content="Automated Analysis - B.A. Seminar at the IfKW, SS 2023" />
<meta name="generator" content="bookdown 0.34 and GitBook 2.6.7" />
<meta property="og:title" content=" 6 Tutorial: Text manipulation with stringr | Automated procedures for analyzing business communication strategies: Focusing on the board game industry" />
<meta property="og:type" content="book" />
<meta property="og:description" content="Automated Analysis - B.A. Seminar at the IfKW, SS 2023" />
<meta name="twitter:card" content="summary" />
<meta name="twitter:title" content=" 6 Tutorial: Text manipulation with stringr | Automated procedures for analyzing business communication strategies: Focusing on the board game industry" />
<meta name="twitter:description" content="Automated Analysis - B.A. Seminar at the IfKW, SS 2023" />
<meta name="author" content="Lara Kobilke, IfKW, Ludwig-Maximilians-Universität München" />
<meta name="date" content="2023-07-06" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black" />
<link rel="prev" href="exercise-4-test-your-knowledge.html"/>
<link rel="next" href="exercise-5-test-your-knowledge.html"/>
<script src="libs/jquery-3.6.0/jquery-3.6.0.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/fuse.js@6.4.6/dist/fuse.min.js"></script>
<link href="libs/gitbook-2.6.7/css/style.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-table.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-bookdown.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-highlight.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-search.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-fontsettings.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-clipboard.css" rel="stylesheet" />
<link href="libs/anchor-sections-1.1.0/anchor-sections.css" rel="stylesheet" />
<link href="libs/anchor-sections-1.1.0/anchor-sections-hash.css" rel="stylesheet" />
<script src="libs/anchor-sections-1.1.0/anchor-sections.js"></script>
<style type="text/css">
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { color: #008000; } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { color: #008000; font-weight: bold; } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<style type="text/css">
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
</style>
<link rel="stylesheet" href="style.css" type="text/css" />
</head>
<body>
<div class="book without-animation with-summary font-size-2 font-family-1" data-basepath=".">
<div class="book-summary">
<nav role="navigation">
<ul class="summary">
<li><a href="./">Automated Analysis</a></li>
<li class="divider"></li>
<li class="chapter" data-level="" data-path="index.html"><a href="index.html"><i class="fa fa-check"></i>General information on the course</a>
<ul>
<li class="chapter" data-level="" data-path="index.html"><a href="index.html#what-can-i-learn-from-this-tutorial"><i class="fa fa-check"></i>What can I learn from this tutorial?</a></li>
<li class="chapter" data-level="" data-path="index.html"><a href="index.html#what-can-i-do-if-i-have-problems-with-my-r-code"><i class="fa fa-check"></i>What can I do if I have problems with my R code?</a></li>
</ul></li>
<li class="chapter" data-level="1" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html"><i class="fa fa-check"></i><b>1</b> Tutorial: Installing &amp; Understanding R/R Studio</a>
<ul>
<li class="chapter" data-level="1.1" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#what-is-r"><i class="fa fa-check"></i><b>1.1</b> What is R?</a></li>
<li class="chapter" data-level="1.2" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#installing-r"><i class="fa fa-check"></i><b>1.2</b> Installing R</a></li>
<li class="chapter" data-level="1.3" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#installing-r-studio"><i class="fa fa-check"></i><b>1.3</b> Installing R Studio</a></li>
<li class="chapter" data-level="1.4" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#updating-r-and-r-studio"><i class="fa fa-check"></i><b>1.4</b> Updating R and R Studio</a>
<ul>
<li class="chapter" data-level="1.4.1" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#on-windows"><i class="fa fa-check"></i><b>1.4.1</b> On Windows</a></li>
<li class="chapter" data-level="1.4.2" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#on-mac"><i class="fa fa-check"></i><b>1.4.2</b> On Mac</a></li>
</ul></li>
<li class="chapter" data-level="1.5" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#how-does-r-work"><i class="fa fa-check"></i><b>1.5</b> How does R work?</a></li>
<li class="chapter" data-level="1.6" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#why-should-i-use-r"><i class="fa fa-check"></i><b>1.6</b> Why should I use R?</a></li>
<li class="chapter" data-level="1.7" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#how-does-r-studio-work"><i class="fa fa-check"></i><b>1.7</b> How does R Studio work?</a>
<ul>
<li class="chapter" data-level="1.7.1" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#source-writing-your-own-code"><i class="fa fa-check"></i><b>1.7.1</b> Source: Writing your own code</a></li>
<li class="chapter" data-level="1.7.2" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#console-printing-results"><i class="fa fa-check"></i><b>1.7.2</b> Console: Printing results</a></li>
<li class="chapter" data-level="1.7.3" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#environment-overview-of-objects"><i class="fa fa-check"></i><b>1.7.3</b> Environment: Overview of objects</a></li>
<li class="chapter" data-level="1.7.4" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#plotshelppackages-do-everything-else"><i class="fa fa-check"></i><b>1.7.4</b> Plots/Help/Packages: Do everything else</a></li>
</ul></li>
<li class="chapter" data-level="1.8" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#why-arent-we-using-a-gui-for-point-and-click-analyses"><i class="fa fa-check"></i><b>1.8</b> Why aren’t we using a GUI for point-and-click analyses?</a></li>
<li class="chapter" data-level="1.9" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#packages"><i class="fa fa-check"></i><b>1.9</b> Packages</a>
<ul>
<li class="chapter" data-level="1.9.1" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#installing-packages"><i class="fa fa-check"></i><b>1.9.1</b> Installing packages</a></li>
<li class="chapter" data-level="1.9.2" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#activating-packages"><i class="fa fa-check"></i><b>1.9.2</b> Activating packages</a></li>
<li class="chapter" data-level="1.9.3" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#getting-information-about-packages"><i class="fa fa-check"></i><b>1.9.3</b> Getting information about packages</a></li>
</ul></li>
<li class="chapter" data-level="1.10" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#take-aways"><i class="fa fa-check"></i><b>1.10</b> Take-Aways</a></li>
<li class="chapter" data-level="1.11" data-path="tutorial-installing-understanding-rr-studio.html"><a href="tutorial-installing-understanding-rr-studio.html#additional-tutorials"><i class="fa fa-check"></i><b>1.11</b> Additional tutorials</a></li>
</ul></li>
<li class="chapter" data-level="2" data-path="tutorial-using-r-as-a-calculator.html"><a href="tutorial-using-r-as-a-calculator.html"><i class="fa fa-check"></i><b>2</b> Tutorial: Using R as a calculator</a>
<ul>
<li class="chapter" data-level="2.1" data-path="tutorial-using-r-as-a-calculator.html"><a href="tutorial-using-r-as-a-calculator.html#using-variables-for-calculation"><i class="fa fa-check"></i><b>2.1</b> Using variables for calculation</a></li>
<li class="chapter" data-level="2.2" data-path="tutorial-using-r-as-a-calculator.html"><a href="tutorial-using-r-as-a-calculator.html#using-vectors-for-calculation"><i class="fa fa-check"></i><b>2.2</b> Using vectors for calculation</a></li>
<li class="chapter" data-level="2.3" data-path="tutorial-using-r-as-a-calculator.html"><a href="tutorial-using-r-as-a-calculator.html#selecting-values-from-a-vector"><i class="fa fa-check"></i><b>2.3</b> Selecting values from a vector</a></li>
<li class="chapter" data-level="2.4" data-path="tutorial-using-r-as-a-calculator.html"><a href="tutorial-using-r-as-a-calculator.html#take-aways-1"><i class="fa fa-check"></i><b>2.4</b> Take-Aways</a></li>
<li class="chapter" data-level="2.5" data-path="tutorial-using-r-as-a-calculator.html"><a href="tutorial-using-r-as-a-calculator.html#additional-tutorials-1"><i class="fa fa-check"></i><b>2.5</b> Additional tutorials</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="exercise-1.html"><a href="exercise-1.html"><i class="fa fa-check"></i>Exercise 1</a>
<ul>
<li class="chapter" data-level="" data-path="exercise-1.html"><a href="exercise-1.html#task-1"><i class="fa fa-check"></i>Task 1</a></li>
<li class="chapter" data-level="" data-path="exercise-1.html"><a href="exercise-1.html#task-2"><i class="fa fa-check"></i>Task 2</a></li>
<li class="chapter" data-level="" data-path="exercise-1.html"><a href="exercise-1.html#task-3"><i class="fa fa-check"></i>Task 3</a></li>
</ul></li>
<li class="chapter" data-level="3" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html"><i class="fa fa-check"></i><b>3</b> Tutorial: Working with data (files)</a>
<ul>
<li class="chapter" data-level="3.1" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html#defining-your-working-directory"><i class="fa fa-check"></i><b>3.1</b> Defining your working directory</a>
<ul>
<li class="chapter" data-level="3.1.1" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html#optional-setting-the-working-directory-on-a-remote-desktop"><i class="fa fa-check"></i><b>3.1.1</b> Optional: Setting the working directory on a remote desktop</a></li>
</ul></li>
<li class="chapter" data-level="3.2" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html#import-data-from-your-working-directory"><i class="fa fa-check"></i><b>3.2</b> Import data from your working directory</a></li>
<li class="chapter" data-level="3.3" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html#subsetting-variables-columns-in-data-frames"><i class="fa fa-check"></i><b>3.3</b> Subsetting variables / columns in data frames</a></li>
<li class="chapter" data-level="3.4" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html#subsetting-observations-rows-in-data-frames"><i class="fa fa-check"></i><b>3.4</b> Subsetting observations / rows in data frames</a></li>
<li class="chapter" data-level="3.5" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html#subsetting-values-cells-in-data-frames"><i class="fa fa-check"></i><b>3.5</b> Subsetting values / cells in data frames</a></li>
<li class="chapter" data-level="3.6" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html#subsetting-data-with-conditions"><i class="fa fa-check"></i><b>3.6</b> Subsetting data with conditions</a></li>
<li class="chapter" data-level="3.7" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html#optional-import-data-from-various-file-formats"><i class="fa fa-check"></i><b>3.7</b> Optional: Import data from various file formats</a>
<ul>
<li class="chapter" data-level="3.7.1" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html#importing-data-from-excel"><i class="fa fa-check"></i><b>3.7.1</b> Importing data from Excel</a></li>
<li class="chapter" data-level="3.7.2" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html#importing-data-from-json"><i class="fa fa-check"></i><b>3.7.2</b> Importing data from JSON</a></li>
<li class="chapter" data-level="3.7.3" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html#importing-data-from-spss"><i class="fa fa-check"></i><b>3.7.3</b> Importing data from SPSS</a></li>
</ul></li>
<li class="chapter" data-level="3.8" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html#take-aways-2"><i class="fa fa-check"></i><b>3.8</b> Take-Aways</a></li>
<li class="chapter" data-level="3.9" data-path="tutorial-working-with-data-files.html"><a href="tutorial-working-with-data-files.html#additional-tutorials-2"><i class="fa fa-check"></i><b>3.9</b> Additional tutorials</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="exercise-2.html"><a href="exercise-2.html"><i class="fa fa-check"></i>Exercise 2</a>
<ul>
<li class="chapter" data-level="" data-path="exercise-2.html"><a href="exercise-2.html#task-1-1"><i class="fa fa-check"></i>Task 1</a></li>
<li class="chapter" data-level="" data-path="exercise-2.html"><a href="exercise-2.html#task-2-1"><i class="fa fa-check"></i>Task 2</a></li>
<li class="chapter" data-level="" data-path="exercise-2.html"><a href="exercise-2.html#task-3-1"><i class="fa fa-check"></i>Task 3</a></li>
</ul></li>
<li class="chapter" data-level="4" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html"><i class="fa fa-check"></i><b>4</b> Tutorial: Data management with tidyverse</a>
<ul>
<li class="chapter" data-level="4.1" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html#why-not-stick-with-base-r"><i class="fa fa-check"></i><b>4.1</b> Why not stick with Base R?</a></li>
<li class="chapter" data-level="4.2" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html#tidyverse-packages"><i class="fa fa-check"></i><b>4.2</b> Tidyverse packages</a></li>
<li class="chapter" data-level="4.3" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html#tidy-data"><i class="fa fa-check"></i><b>4.3</b> Tidy data</a></li>
<li class="chapter" data-level="4.4" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html#the-pipe-operator"><i class="fa fa-check"></i><b>4.4</b> The pipe operator</a></li>
<li class="chapter" data-level="4.5" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html#data-transformation-with-dplyr"><i class="fa fa-check"></i><b>4.5</b> Data transformation with dplyr</a>
<ul>
<li class="chapter" data-level="4.5.1" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html#select"><i class="fa fa-check"></i><b>4.5.1</b> select()</a></li>
<li class="chapter" data-level="4.5.2" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html#filter"><i class="fa fa-check"></i><b>4.5.2</b> filter()</a></li>
<li class="chapter" data-level="4.5.3" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html#arrange"><i class="fa fa-check"></i><b>4.5.3</b> arrange()</a></li>
<li class="chapter" data-level="4.5.4" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html#mutate"><i class="fa fa-check"></i><b>4.5.4</b> mutate()</a></li>
<li class="chapter" data-level="4.5.5" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html#summarize-group_by"><i class="fa fa-check"></i><b>4.5.5</b> summarize() [+ group_by()]</a></li>
<li class="chapter" data-level="4.5.6" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html#chaining-functions-in-a-pipe"><i class="fa fa-check"></i><b>4.5.6</b> Chaining functions in a pipe</a></li>
</ul></li>
<li class="chapter" data-level="4.6" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html#take-aways-3"><i class="fa fa-check"></i><b>4.6</b> Take-Aways</a></li>
<li class="chapter" data-level="4.7" data-path="tutorial-data-management-with-tidyverse.html"><a href="tutorial-data-management-with-tidyverse.html#additional-tutorials-3"><i class="fa fa-check"></i><b>4.7</b> Additional tutorials</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="exercise-3-test-your-knowledge.html"><a href="exercise-3-test-your-knowledge.html"><i class="fa fa-check"></i>Exercise 3: Test your knowledge</a>
<ul>
<li class="chapter" data-level="" data-path="exercise-3-test-your-knowledge.html"><a href="exercise-3-test-your-knowledge.html#task-1-2"><i class="fa fa-check"></i>Task 1</a></li>
<li class="chapter" data-level="" data-path="exercise-3-test-your-knowledge.html"><a href="exercise-3-test-your-knowledge.html#task-2-2"><i class="fa fa-check"></i>Task 2</a></li>
<li class="chapter" data-level="" data-path="exercise-3-test-your-knowledge.html"><a href="exercise-3-test-your-knowledge.html#task-3-2"><i class="fa fa-check"></i>Task 3</a></li>
<li class="chapter" data-level="" data-path="exercise-3-test-your-knowledge.html"><a href="exercise-3-test-your-knowledge.html#task-4"><i class="fa fa-check"></i>Task 4</a></li>
<li class="chapter" data-level="" data-path="exercise-3-test-your-knowledge.html"><a href="exercise-3-test-your-knowledge.html#task-5"><i class="fa fa-check"></i>Task 5</a></li>
<li class="chapter" data-level="" data-path="exercise-3-test-your-knowledge.html"><a href="exercise-3-test-your-knowledge.html#task-6"><i class="fa fa-check"></i>Task 6</a></li>
<li class="chapter" data-level="" data-path="exercise-3-test-your-knowledge.html"><a href="exercise-3-test-your-knowledge.html#task-7"><i class="fa fa-check"></i>Task 7</a></li>
</ul></li>
<li class="chapter" data-level="5" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html"><i class="fa fa-check"></i><b>5</b> Tutorial: Data visualization with ggplot</a>
<ul>
<li class="chapter" data-level="5.1" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#why-not-stick-with-base-r-1"><i class="fa fa-check"></i><b>5.1</b> Why not stick with Base R?</a></li>
<li class="chapter" data-level="5.2" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#components-of-a-ggplot-graph"><i class="fa fa-check"></i><b>5.2</b> Components of a ggplot graph</a></li>
<li class="chapter" data-level="5.3" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#installing-activating-ggplot"><i class="fa fa-check"></i><b>5.3</b> Installing &amp; activating ggplot</a></li>
<li class="chapter" data-level="5.4" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#building-your-first-plot"><i class="fa fa-check"></i><b>5.4</b> Building your first plot</a>
<ul>
<li class="chapter" data-level="5.4.1" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#data"><i class="fa fa-check"></i><b>5.4.1</b> Data</a></li>
<li class="chapter" data-level="5.4.2" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#aesthetics"><i class="fa fa-check"></i><b>5.4.2</b> Aesthetics</a></li>
<li class="chapter" data-level="5.4.3" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#geometrics"><i class="fa fa-check"></i><b>5.4.3</b> Geometrics</a></li>
<li class="chapter" data-level="5.4.4" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#scales"><i class="fa fa-check"></i><b>5.4.4</b> Scales</a></li>
<li class="chapter" data-level="5.4.5" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#themes"><i class="fa fa-check"></i><b>5.4.5</b> Themes</a></li>
<li class="chapter" data-level="5.4.6" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#labs"><i class="fa fa-check"></i><b>5.4.6</b> Labs</a></li>
<li class="chapter" data-level="5.4.7" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#facets"><i class="fa fa-check"></i><b>5.4.7</b> Facets</a></li>
<li class="chapter" data-level="5.4.8" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#saving-graphs"><i class="fa fa-check"></i><b>5.4.8</b> Saving graphs</a></li>
</ul></li>
<li class="chapter" data-level="5.5" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#other-common-plot-types"><i class="fa fa-check"></i><b>5.5</b> Other common plot types</a>
<ul>
<li class="chapter" data-level="5.5.1" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#bar-plots"><i class="fa fa-check"></i><b>5.5.1</b> Bar plots</a></li>
<li class="chapter" data-level="5.5.2" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#box-plots"><i class="fa fa-check"></i><b>5.5.2</b> Box plots</a></li>
</ul></li>
<li class="chapter" data-level="5.6" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#take-aways-4"><i class="fa fa-check"></i><b>5.6</b> Take-Aways</a></li>
<li class="chapter" data-level="5.7" data-path="tutorial-data-visualization-with-ggplot.html"><a href="tutorial-data-visualization-with-ggplot.html#additional-tutorials-4"><i class="fa fa-check"></i><b>5.7</b> Additional tutorials</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="exercise-4-test-your-knowledge.html"><a href="exercise-4-test-your-knowledge.html"><i class="fa fa-check"></i>Exercise 4: Test your knowledge</a>
<ul>
<li class="chapter" data-level="" data-path="exercise-4-test-your-knowledge.html"><a href="exercise-4-test-your-knowledge.html#task-1-3"><i class="fa fa-check"></i>Task 1</a></li>
<li class="chapter" data-level="" data-path="exercise-4-test-your-knowledge.html"><a href="exercise-4-test-your-knowledge.html#task-2-3"><i class="fa fa-check"></i>Task 2</a></li>
<li class="chapter" data-level="" data-path="exercise-4-test-your-knowledge.html"><a href="exercise-4-test-your-knowledge.html#task-3-3"><i class="fa fa-check"></i>Task 3</a></li>
<li class="chapter" data-level="" data-path="exercise-4-test-your-knowledge.html"><a href="exercise-4-test-your-knowledge.html#task-4-1"><i class="fa fa-check"></i>Task 4</a></li>
<li class="chapter" data-level="" data-path="exercise-4-test-your-knowledge.html"><a href="exercise-4-test-your-knowledge.html#task-5-1"><i class="fa fa-check"></i>Task 5</a></li>
<li class="chapter" data-level="" data-path="exercise-4-test-your-knowledge.html"><a href="exercise-4-test-your-knowledge.html#task-6-1"><i class="fa fa-check"></i>Task 6</a></li>
</ul></li>
<li class="chapter" data-level="6" data-path="tutorial-text-manipulation-with-stringr.html"><a href="tutorial-text-manipulation-with-stringr.html"><i class="fa fa-check"></i><b>6</b> Tutorial: Text manipulation with stringr</a>
<ul>
<li class="chapter" data-level="6.1" data-path="tutorial-text-manipulation-with-stringr.html"><a href="tutorial-text-manipulation-with-stringr.html#whats-stringr"><i class="fa fa-check"></i><b>6.1</b> What’s stringr?</a></li>
<li class="chapter" data-level="6.2" data-path="tutorial-text-manipulation-with-stringr.html"><a href="tutorial-text-manipulation-with-stringr.html#working-with-strings"><i class="fa fa-check"></i><b>6.2</b> Working with strings</a></li>
<li class="chapter" data-level="6.3" data-path="tutorial-text-manipulation-with-stringr.html"><a href="tutorial-text-manipulation-with-stringr.html#working-with-string-patterns"><i class="fa fa-check"></i><b>6.3</b> Working with string patterns</a></li>
<li class="chapter" data-level="6.4" data-path="tutorial-text-manipulation-with-stringr.html"><a href="tutorial-text-manipulation-with-stringr.html#working-with-regular-expressions"><i class="fa fa-check"></i><b>6.4</b> Working with regular expressions</a>
<ul>
<li class="chapter" data-level="6.4.1" data-path="tutorial-text-manipulation-with-stringr.html"><a href="tutorial-text-manipulation-with-stringr.html#anchors-alternates-and-quantifiers"><i class="fa fa-check"></i><b>6.4.1</b> Anchors, alternates, and quantifiers</a></li>
<li class="chapter" data-level="6.4.2" data-path="tutorial-text-manipulation-with-stringr.html"><a href="tutorial-text-manipulation-with-stringr.html#look-arounds-and-groups"><i class="fa fa-check"></i><b>6.4.2</b> Look arounds and groups</a></li>
<li class="chapter" data-level="6.4.3" data-path="tutorial-text-manipulation-with-stringr.html"><a href="tutorial-text-manipulation-with-stringr.html#match-specific-types-of-characters"><i class="fa fa-check"></i><b>6.4.3</b> Match specific types of characters</a></li>
</ul></li>
<li class="chapter" data-level="6.5" data-path="tutorial-text-manipulation-with-stringr.html"><a href="tutorial-text-manipulation-with-stringr.html#take-aways-5"><i class="fa fa-check"></i><b>6.5</b> Take-Aways</a></li>
<li class="chapter" data-level="6.6" data-path="tutorial-text-manipulation-with-stringr.html"><a href="tutorial-text-manipulation-with-stringr.html#additional-tutorials-5"><i class="fa fa-check"></i><b>6.6</b> Additional tutorials</a></li>
</ul></li>
<li class="chapter" data-level="7" data-path="exercise-5-test-your-knowledge.html"><a href="exercise-5-test-your-knowledge.html"><i class="fa fa-check"></i><b>7</b> Exercise 5: Test your knowledge</a>
<ul>
<li class="chapter" data-level="7.1" data-path="exercise-5-test-your-knowledge.html"><a href="exercise-5-test-your-knowledge.html#task-1-4"><i class="fa fa-check"></i><b>7.1</b> Task 1</a></li>
<li class="chapter" data-level="7.2" data-path="exercise-5-test-your-knowledge.html"><a href="exercise-5-test-your-knowledge.html#task-2-4"><i class="fa fa-check"></i><b>7.2</b> Task 2</a></li>
<li class="chapter" data-level="7.3" data-path="exercise-5-test-your-knowledge.html"><a href="exercise-5-test-your-knowledge.html#task-3-4"><i class="fa fa-check"></i><b>7.3</b> Task 3</a></li>
<li class="chapter" data-level="7.4" data-path="exercise-5-test-your-knowledge.html"><a href="exercise-5-test-your-knowledge.html#task-4-2"><i class="fa fa-check"></i><b>7.4</b> Task 4</a></li>
</ul></li>
<li class="chapter" data-level="8" data-path="tutorial-data-collection-with-bardeen.html"><a href="tutorial-data-collection-with-bardeen.html"><i class="fa fa-check"></i><b>8</b> Tutorial: Data Collection with bardeen.ai</a>
<ul>
<li class="chapter" data-level="8.1" data-path="tutorial-data-collection-with-bardeen.html"><a href="tutorial-data-collection-with-bardeen.html#sign-up"><i class="fa fa-check"></i><b>8.1</b> Sign up</a></li>
<li class="chapter" data-level="8.2" data-path="tutorial-data-collection-with-bardeen.html"><a href="tutorial-data-collection-with-bardeen.html#choose-a-playbook"><i class="fa fa-check"></i><b>8.2</b> Choose a Playbook</a></li>
<li class="chapter" data-level="8.3" data-path="tutorial-data-collection-with-bardeen.html"><a href="tutorial-data-collection-with-bardeen.html#run-the-playbook"><i class="fa fa-check"></i><b>8.3</b> Run the Playbook</a></li>
</ul></li>
<li class="chapter" data-level="9" data-path="tutorial-analyze-data-with-stminsights.html"><a href="tutorial-analyze-data-with-stminsights.html"><i class="fa fa-check"></i><b>9</b> Tutorial: Analyze Data with stminsights</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html"><i class="fa fa-check"></i>Solutions</a>
<ul>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#solutions-for-exercise-1"><i class="fa fa-check"></i>Solutions for Exercise 1</a>
<ul>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-1-5"><i class="fa fa-check"></i>Task 1</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-2-5"><i class="fa fa-check"></i>Task 2</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-3-5"><i class="fa fa-check"></i>Task 3</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#solutions-for-exercise-2"><i class="fa fa-check"></i>Solutions for Exercise 2</a>
<ul>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-1-6"><i class="fa fa-check"></i>Task 1</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-2-6"><i class="fa fa-check"></i>Task 2</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-3-6"><i class="fa fa-check"></i>Task 3</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#solutions-for-exercise-3"><i class="fa fa-check"></i>Solutions for Exercise 3</a>
<ul>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-1-7"><i class="fa fa-check"></i>Task 1</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-2-7"><i class="fa fa-check"></i>Task 2</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-3-7"><i class="fa fa-check"></i>Task 3</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-4-3"><i class="fa fa-check"></i>Task 4</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-5-2"><i class="fa fa-check"></i>Task 5</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-6-2"><i class="fa fa-check"></i>Task 6</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-7-1"><i class="fa fa-check"></i>Task 7</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#solutions-for-exercise-4"><i class="fa fa-check"></i>Solutions for Exercise 4</a>
<ul>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-1-8"><i class="fa fa-check"></i>Task 1</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-2-8"><i class="fa fa-check"></i>Task 2</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-3-8"><i class="fa fa-check"></i>Task 3</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-4-4"><i class="fa fa-check"></i>Task 4</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-5-3"><i class="fa fa-check"></i>Task 5</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-6-3"><i class="fa fa-check"></i>Task 6</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#solutions-for-exercise-5"><i class="fa fa-check"></i>Solutions for Exercise 5</a>
<ul>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-1-9"><i class="fa fa-check"></i>Task 1</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-2-9"><i class="fa fa-check"></i>Task 2</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-3-9"><i class="fa fa-check"></i>Task 3</a></li>
<li class="chapter" data-level="" data-path="solutions.html"><a href="solutions.html#task-4-5"><i class="fa fa-check"></i>Task 4</a></li>
</ul></li>
</ul></li>
<li class="divider"></li>
<li><a href="https://github.com/LKobilke/boardgame.git" target="_blank">Published with bookdown</a></li>
</ul>
</nav>
</div>
<div class="book-body">
<div class="body-inner">
<div class="book-header" role="navigation">
<h1>
<i class="fa fa-circle-o-notch fa-spin"></i><a href="./">Automated procedures for analyzing business communication strategies: Focusing on the board game industry</a>
</h1>
</div>
<div class="page-wrapper" tabindex="-1" role="main">
<div class="page-inner">
<section class="normal" id="section-">
<div id="tutorial-text-manipulation-with-stringr" class="section level1 hasAnchor" number="6">
<h1><span class="header-section-number"> 6</span> Tutorial: Text manipulation with stringr<a href="tutorial-text-manipulation-with-stringr.html#tutorial-text-manipulation-with-stringr" class="anchor-section" aria-label="Anchor link to header"></a></h1>
<p><strong>After working through Tutorial 6, you’ll…</strong></p>
<ul>
<li>understand the concept of string patterns and regular expressions</li>
<li>know how to search for string patterns</li>
</ul>
<div id="whats-stringr" class="section level2 hasAnchor" number="6.1">
<h2><span class="header-section-number">6.1</span> What’s stringr?<a href="tutorial-text-manipulation-with-stringr.html#whats-stringr" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<p>The <code>stringr</code> package is part of the <code>tidyverse</code> family,
i.e., it is installed together with the tidyverse. The package offers a
neat set of functions that make working with strings simple, even for
beginners. Therefore, <code>stringr</code> is a good place to start getting into
text data management. A string is <em>a data type that is used to represent
text rather than numbers</em>.</p>
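<p>A minimal sketch of how to get started, assuming the package is already installed (for example via <code>install.packages("tidyverse")</code>): load <code>stringr</code> before running the examples in this tutorial. The object name <code>greeting</code> is only an illustration.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(stringr) # alternatively: library(tidyverse), which attaches stringr as well

greeting <- "Hello, board games!" # a string is text wrapped in quotation marks
is.character(greeting)            # TRUE: strings are stored as type "character"
str_to_upper(greeting)            # "HELLO, BOARD GAMES!"</code></pre></div>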
</div>
<div id="working-with-strings" class="section level2 hasAnchor" number="6.2">
<h2><span class="header-section-number">6.2</span> Working with strings<a href="tutorial-text-manipulation-with-stringr.html#working-with-strings" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<p>Let’s create a vector containing some email addresses and print them to the console.</p>
<div class="sourceCode" id="cb188"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb188-1"><a href="tutorial-text-manipulation-with-stringr.html#cb188-1" tabindex="-1"></a>emails <span class="ot"><-</span> <span class="fu">c</span>(<span class="st">"mark@example.com"</span>, <span class="st">"john@example.com"</span>, <span class="st">"lisa@example.org"</span>, <span class="st">"peter@example.net"</span>, <span class="st">"jane@example.co.uk"</span>, <span class="st">"tim@example.edu"</span>, <span class="st">"emmajay@example.gov"</span>)</span>
<span id="cb188-2"><a href="tutorial-text-manipulation-with-stringr.html#cb188-2" tabindex="-1"></a></span>
<span id="cb188-3"><a href="tutorial-text-manipulation-with-stringr.html#cb188-3" tabindex="-1"></a>emails</span></code></pre></div>
<pre><code>## [1] "mark@example.com" "john@example.com" "lisa@example.org"
## [4] "peter@example.net" "jane@example.co.uk" "tim@example.edu"
## [7] "emmajay@example.gov"</code></pre>
<p>We can use the <code>str_length()</code> function to find out how many characters each email address contains.</p>
<div class="sourceCode" id="cb190"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb190-1"><a href="tutorial-text-manipulation-with-stringr.html#cb190-1" tabindex="-1"></a><span class="fu">str_length</span>(emails)</span></code></pre></div>
<pre><code>## [1] 16 16 16 17 18 15 19</code></pre>
<p>That was easy. Next, we want to join multiple strings into a single
string. We will use the <code>str_c()</code> function.</p>
<div class="sourceCode" id="cb192"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb192-1"><a href="tutorial-text-manipulation-with-stringr.html#cb192-1" tabindex="-1"></a><span class="fu">str_c</span>(emails, <span class="at">collapse =</span> <span class="st">" and "</span>)</span></code></pre></div>
<pre><code>## [1] "mark@example.com and john@example.com and lisa@example.org and peter@example.net and jane@example.co.uk and tim@example.edu and emmajay@example.gov"</code></pre>
<div class="sourceCode" id="cb194"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb194-1"><a href="tutorial-text-manipulation-with-stringr.html#cb194-1" tabindex="-1"></a><span class="co"># If collapse is not NULL, all elements are joined into one string, with collapse inserted between them, here: " and "</span></span></code></pre></div>
<p>You can also prepend a fixed string to every element. With <code>collapse = NULL</code> (the default), the result stays a vector with one string per email address:</p>
<div class="sourceCode" id="cb195"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb195-1"><a href="tutorial-text-manipulation-with-stringr.html#cb195-1" tabindex="-1"></a><span class="fu">str_c</span>(<span class="st">"My favourite correspondent is: "</span>, emails, <span class="at">collapse =</span> <span class="cn">NULL</span>) </span></code></pre></div>
<pre><code>## [1] "My favourite correspondent is: mark@example.com"
## [2] "My favourite correspondent is: john@example.com"
## [3] "My favourite correspondent is: lisa@example.org"
## [4] "My favourite correspondent is: peter@example.net"
## [5] "My favourite correspondent is: jane@example.co.uk"
## [6] "My favourite correspondent is: tim@example.edu"
## [7] "My favourite correspondent is: emmajay@example.gov"</code></pre>
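<p>Besides <code>collapse</code>, <code>str_c()</code> has a <code>sep</code> argument that is placed between the pieces being joined element by element. A small sketch (the vectors <code>users</code> and <code>domains</code> are made up for illustration):</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(stringr)

users   <- c("mark", "lisa")
domains <- c("example.com", "example.org")

# sep is inserted between the arguments, element by element
addresses <- str_c(users, domains, sep = "@")
addresses
# "mark@example.com" "lisa@example.org"

# collapse then joins the resulting elements into one single string
joined <- str_c(users, domains, sep = "@", collapse = "; ")
joined
# "mark@example.com; lisa@example.org"</code></pre></div>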
<p>If we want to extract substrings from a character vector, for example, keeping only the user part of each email address, we can use the <code>str_sub()</code> function.</p>
<div class="sourceCode" id="cb197"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb197-1"><a href="tutorial-text-manipulation-with-stringr.html#cb197-1" tabindex="-1"></a><span class="fu">str_sub</span>(emails, <span class="at">start =</span> <span class="dv">1</span>, <span class="at">end =</span> <span class="fu">str_locate</span>(emails, <span class="st">"@"</span>)[, <span class="st">"start"</span>] <span class="sc">-</span> <span class="dv">1</span>) <span class="co"># str_locate() returns a matrix with the columns "start" and "end"; we only need "start"</span></span></code></pre></div>
<pre><code>## [1] "mark" "john" "lisa" "peter" "jane" "tim" "emmajay"</code></pre>
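<p>The user names can also be obtained without computing positions. The sketch below is an alternative, not the approach used above: it deletes everything from the ‘@’ onwards with <code>str_remove()</code>. The pattern <code>"@.*"</code> is a regular expression, a topic covered later in this tutorial.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(stringr)

emails <- c("mark@example.com", "john@example.com", "lisa@example.org", "peter@example.net", "jane@example.co.uk", "tim@example.edu", "emmajay@example.gov")

# "@.*" matches the '@' and everything after it; str_remove() deletes that part
users <- str_remove(emails, "@.*")
users
# "mark" "john" "lisa" "peter" "jane" "tim" "emmajay"</code></pre></div>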
</div>
<div id="working-with-string-patterns" class="section level2 hasAnchor" number="6.3">
<h2><span class="header-section-number">6.3</span> Working with string patterns<a href="tutorial-text-manipulation-with-stringr.html#working-with-string-patterns" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<p>Often we want to search for certain string patterns in a text document.
<strong>String patterns</strong> are character sequences (for instance, letters,
numbers, or special characters). Suppose we have some email addresses where ‘@’ was wrongly replaced with ‘#’.</p>
<div class="sourceCode" id="cb199"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb199-1"><a href="tutorial-text-manipulation-with-stringr.html#cb199-1" tabindex="-1"></a>wrong_emails <span class="ot"><-</span> <span class="fu">c</span>(<span class="st">"mark#example.com"</span>, <span class="st">"john@example.com"</span>, <span class="st">"lisa@example.org"</span>, <span class="st">"peter@example.net"</span>, <span class="st">"jane@example.co.uk"</span>, <span class="st">"tim@example.edu"</span>, <span class="st">"emmajay@example.gov"</span>)</span></code></pre></div>
<p>We can search for the string pattern “#” and replace it with “@” automatically. Let’s first identify the emails that contain ‘#’ using <code>str_detect()</code>.</p>
<div class="sourceCode" id="cb200"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb200-1"><a href="tutorial-text-manipulation-with-stringr.html#cb200-1" tabindex="-1"></a><span class="fu">str_detect</span>(wrong_emails, <span class="st">"#"</span>)</span></code></pre></div>
<pre><code>## [1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE</code></pre>
<p>We can extract the email addresses that contain ‘#’ using <code>str_subset()</code>.</p>
<div class="sourceCode" id="cb202"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb202-1"><a href="tutorial-text-manipulation-with-stringr.html#cb202-1" tabindex="-1"></a><span class="fu">str_subset</span>(wrong_emails, <span class="st">"#"</span>)</span></code></pre></div>
<pre><code>## [1] "mark#example.com"</code></pre>
<p>We can fix the wrong email addresses using the <code>str_replace()</code> function.</p>
<div class="sourceCode" id="cb204"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb204-1"><a href="tutorial-text-manipulation-with-stringr.html#cb204-1" tabindex="-1"></a>corrected_emails <span class="ot"><-</span> <span class="fu">str_replace</span>(wrong_emails, <span class="st">"#"</span>, <span class="st">"@"</span>)</span>
<span id="cb204-2"><a href="tutorial-text-manipulation-with-stringr.html#cb204-2" tabindex="-1"></a></span>
<span id="cb204-3"><a href="tutorial-text-manipulation-with-stringr.html#cb204-3" tabindex="-1"></a>corrected_emails</span></code></pre></div>
<pre><code>## [1] "mark@example.com" "john@example.com" "lisa@example.org"
## [4] "peter@example.net" "jane@example.co.uk" "tim@example.edu"
## [7] "emmajay@example.gov"</code></pre>
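<p>One detail worth knowing: <code>str_replace()</code> only replaces the first match in each string, whereas <code>str_replace_all()</code> replaces every match. A short sketch with a made-up string that contains two addresses:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(stringr)

# hypothetical example: one string that contains the typo twice
two_typos <- "mark#example.com, john#example.com"

first_only <- str_replace(two_typos, "#", "@")     # fixes the first '#' only
all_fixed  <- str_replace_all(two_typos, "#", "@") # fixes every '#'

first_only
# "mark@example.com, john#example.com"
all_fixed
# "mark@example.com, john@example.com"</code></pre></div>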
<p>As a final step, we can split an email address into user and domain parts based on the ‘@’ character using the <code>str_split()</code> function:</p>
<div class="sourceCode" id="cb206"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb206-1"><a href="tutorial-text-manipulation-with-stringr.html#cb206-1" tabindex="-1"></a><span class="fu">str_split</span>(corrected_emails, <span class="st">"@"</span>)</span></code></pre></div>
<pre><code>## [[1]]
## [1] "mark" "example.com"
##
## [[2]]
## [1] "john" "example.com"
##
## [[3]]
## [1] "lisa" "example.org"
##
## [[4]]
## [1] "peter" "example.net"
##
## [[5]]
## [1] "jane" "example.co.uk"
##
## [[6]]
## [1] "tim" "example.edu"
##
## [[7]]
## [1] "emmajay" "example.gov"</code></pre>
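<p>The list returned by <code>str_split()</code> can be awkward to work with. As a sketch of one possible alternative, the <code>simplify = TRUE</code> argument returns a character matrix instead, so that users and domains can be picked by column. The shortened vector and the object name <code>parts</code> are chosen for illustration only.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(stringr)

some_emails <- c("mark@example.com", "lisa@example.org", "jane@example.co.uk")

# simplify = TRUE returns a matrix: one row per address, one column per piece
parts <- str_split(some_emails, "@", simplify = TRUE)

parts[, 1] # user part
# "mark" "lisa" "jane"
parts[, 2] # domain part
# "example.com" "example.org" "example.co.uk"</code></pre></div>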
</div>
<div id="working-with-regular-expressions" class="section level2 hasAnchor" number="6.4">
<h2><span class="header-section-number">6.4</span> Working with regular expressions<a href="tutorial-text-manipulation-with-stringr.html#working-with-regular-expressions" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<p>Often, we want to match more complicated string patterns than a simple
“#”. For example, we might wish to detect all strings in our mail list that start with the letter “l”, because we want to email those people.</p>
<p>To search and match complex string patterns, we need regular
expressions. <strong>Regular expressions</strong> (short: <em>regex</em>) are a concise
language for describing patterns of text. The characters in a regex are not
always taken literally: some of them, such as <code>^</code> or <code>$</code>, have a special, non-literal meaning.</p>
<div id="anchors-alternates-and-quantifiers" class="section level3 hasAnchor" number="6.4.1">
<h3><span class="header-section-number">6.4.1</span> Anchors, alternates, and quantifiers<a href="tutorial-text-manipulation-with-stringr.html#anchors-alternates-and-quantifiers" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>Let’s search for all email addresses that start with the letter ‘j’. To do this, we first need to learn what “anchors” are. Anchors match
the beginning or the end of a string rather than a character, and they are usually combined with other patterns. We can find all strings that start with the
letter <em>j</em> using the <code>^</code> (start of string) anchor.</p>
<div class="sourceCode" id="cb208"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb208-1"><a href="tutorial-text-manipulation-with-stringr.html#cb208-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"^j"</span>) <span class="co"># ^ stands for "start of string", i.e. we are matching for strings that start with the letter j</span></span></code></pre></div>
<pre><code>## [1] FALSE TRUE FALSE FALSE TRUE FALSE FALSE</code></pre>
<p>To find all emails that end with “com”, we use the <code>$</code> (end of string) anchor:</p>
<div class="sourceCode" id="cb210"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb210-1"><a href="tutorial-text-manipulation-with-stringr.html#cb210-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"com$"</span>) <span class="co"># $ stands for "end of string", i.e. we are matching for strings that end with the letters com</span></span></code></pre></div>
<pre><code>## [1] TRUE TRUE FALSE FALSE FALSE FALSE FALSE</code></pre>
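<p>Note that the pattern <code>"com$"</code> does not check for the dot in “.com”. In a regex, an unescaped <code>.</code> means “any character”, so a literal dot has to be escaped as <code>\\.</code> in R. A small sketch with a made-up second address that ends in “com” without a dot:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(stringr)

# hypothetical test vector: the second entry ends in "com" but not in ".com"
x <- c("mark@example.com", "info@telecom")

ends_com     <- str_detect(x, "com$")    # TRUE TRUE: only checks the last three letters
ends_any_com <- str_detect(x, ".com$")   # TRUE TRUE: '.' matches any character, here the 'e'
ends_dot_com <- str_detect(x, "\\.com$") # TRUE FALSE: '\\.' matches a literal dot only</code></pre></div>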
<p>Please note the difference between these anchored patterns (using <code>^</code> and <code>$</code>) and the simple string pattern “<em>j</em>”, which matches at any position:</p>
<div class="sourceCode" id="cb212"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb212-1"><a href="tutorial-text-manipulation-with-stringr.html#cb212-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"j"</span>) <span class="co"># matches all strings that contain the letter j at any position</span></span></code></pre></div>
<pre><code>## [1] FALSE TRUE FALSE FALSE TRUE FALSE TRUE</code></pre>
<p>Finally, you should also not confuse “<em>^j</em>” with “<em>[^j]</em>”, because inside square brackets <code>[^]</code> stands for “anything but” in regex language.</p>
<div class="sourceCode" id="cb214"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb214-1"><a href="tutorial-text-manipulation-with-stringr.html#cb214-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"[^j]"</span>) <span class="co"># matches all strings that contain at least one character other than j</span></span></code></pre></div>
<pre><code>## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE</code></pre>
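<p>Both meanings of <code>^</code> can be combined. As a brief sketch, the pattern <code>"^[^j]"</code> reads “the string starts with any character except j”, which gives a more informative result than <code>"[^j]"</code> alone:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(stringr)

emails <- c("mark@example.com", "john@example.com", "lisa@example.org", "peter@example.net", "jane@example.co.uk", "tim@example.edu", "emmajay@example.gov")

# first ^ = start of string, [^j] = any single character that is not a j
not_j_first <- str_detect(emails, "^[^j]")
not_j_first
# TRUE FALSE TRUE TRUE FALSE TRUE TRUE</code></pre></div>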
<p>Next, let’s look at “alternates” in regex. Alternates are used to match one of several characters in a single position. They are specified in square brackets, with a list of characters that could be matched. For example, the regular expression “[mj]ohn” will match any string that contains an m or a j followed by ohn. This means it will match both <em>mohn</em> and <em>john</em>.</p>
<p>Let’s look at our emails again and learn about the <strong>very</strong> powerful “match any one of” regex <code>[]</code>.
We will now try to find all the emails that contain the letter m or the letter j at any position of the string.</p>
<div class="sourceCode" id="cb216"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb216-1"><a href="tutorial-text-manipulation-with-stringr.html#cb216-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"[mj]"</span>) <span class="co"># matches all strings that contain either the letter m or the letter j</span></span></code></pre></div>
<pre><code>## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE</code></pre>
<p>Next, we can match all emails that contain a letter in the range from j to m in the alphabet. We will need to use the range operator <code>[-]</code>:</p>
<div class="sourceCode" id="cb218"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb218-1"><a href="tutorial-text-manipulation-with-stringr.html#cb218-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"[j-m]"</span>) <span class="co"># matches all strings that contain either the letter j, k, l, m</span></span></code></pre></div>
<pre><code>## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE</code></pre>
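<p>Both <code>"[mj]"</code> and <code>"[j-m]"</code> return <code>TRUE</code> for every address, because the shared domain part “example” already contains an l and an m. As a sketch of how to get a more telling result, we can apply the same patterns to the user part only. The helper vector <code>users</code> is introduced here for illustration.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(stringr)

emails <- c("mark@example.com", "john@example.com", "lisa@example.org", "peter@example.net", "jane@example.co.uk", "tim@example.edu", "emmajay@example.gov")

# keep only the part before the '@'
users <- str_remove(emails, "@.*")

has_m_or_j <- str_detect(users, "[mj]")  # an m or a j anywhere in the user name
has_j_to_m <- str_detect(users, "[j-m]") # any of j, k, l, m anywhere in the user name

has_m_or_j
# TRUE TRUE FALSE FALSE TRUE TRUE TRUE
has_j_to_m
# TRUE TRUE TRUE FALSE TRUE TRUE TRUE</code></pre></div>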
<p>The <code>?</code> operator (zero or one), the <code>*</code> operator (zero or more), the <code>+</code> operator
(one or more), and the <code>{n}</code> operator (exactly n) are also handy. These regex are called “quantifiers” and specify how many times the preceding character must occur in a row. Note that <code>?</code> and <code>*</code> also accept zero occurrences, so the first two examples below match every string.</p>
<div class="sourceCode" id="cb220"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb220-1"><a href="tutorial-text-manipulation-with-stringr.html#cb220-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"a?"</span>) <span class="co"># matches all strings that contain zero or one a</span></span></code></pre></div>
<pre><code>## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE</code></pre>
<div class="sourceCode" id="cb222"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb222-1"><a href="tutorial-text-manipulation-with-stringr.html#cb222-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"a*"</span>) <span class="co"># matches all strings that contain zero or more a</span></span></code></pre></div>
<pre><code>## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE</code></pre>
<div class="sourceCode" id="cb224"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb224-1"><a href="tutorial-text-manipulation-with-stringr.html#cb224-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"r+"</span>) <span class="co"># matches all strings that contain one or more r's in a row</span></span></code></pre></div>
<pre><code>## [1] TRUE FALSE TRUE TRUE FALSE FALSE FALSE</code></pre>
<p>Lastly, let’s find all emails that contain two a’s in a row (“aa”):</p>
<div class="sourceCode" id="cb226"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb226-1"><a href="tutorial-text-manipulation-with-stringr.html#cb226-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"a{2}"</span>) <span class="co"># matches all strings that contain two consecutive a's, i.e. "aa"</span></span></code></pre></div>
<pre><code>## [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE</code></pre>
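<p>The quantifier <code>{2}</code> refers to consecutive repetitions, which is why none of the addresses match. If the goal is instead to find strings that contain a certain number of a’s in total, one option (a sketch, not the only way) is to count the matches with <code>str_count()</code>. The small vector <code>words</code> is made up for illustration.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(stringr)

emails <- c("mark@example.com", "john@example.com", "lisa@example.org", "peter@example.net", "jane@example.co.uk", "tim@example.edu", "emmajay@example.gov")

# a{2} needs two a's directly next to each other
words <- c("baaz", "bar")
double_a <- str_detect(words, "a{2}")
double_a
# TRUE FALSE

# str_count() counts how often the pattern occurs in each string
n_a <- str_count(emails, "a")
n_a
# 2 1 2 1 2 1 3

# addresses that contain exactly two a's in total
two_a_total <- n_a == 2
two_a_total
# TRUE FALSE TRUE FALSE TRUE FALSE FALSE</code></pre></div>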
<p>This looks great! We now know what anchors, alternates, and quantifiers in regex are!</p>
</div>
<div id="look-arounds-and-groups" class="section level3 hasAnchor" number="6.4.2">
<h3><span class="header-section-number">6.4.2</span> Look arounds and groups<a href="tutorial-text-manipulation-with-stringr.html#look-arounds-and-groups" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>Two other useful types of regex are “look arounds” and “groups”. Look arounds are used to match patterns before or after a certain character or pattern, and groups are used to group elements of a pattern and specify the order in which they should be evaluated.</p>
<p>We can leverage look arounds to find certain characters or patterns in our emails. For example, let’s say we want to find all emails where the letter ‘a’ is immediately followed by the letter ‘c’. This is where the “followed by” regex (<code>?=</code>) comes in.</p>
<div class="sourceCode" id="cb228"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb228-1"><a href="tutorial-text-manipulation-with-stringr.html#cb228-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"a(?=c)"</span>) <span class="co"># matches all strings that contain an 'a' followed by a 'c'</span></span></code></pre></div>
<pre><code>## [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE</code></pre>
<p>Alternatively, we can use a “not followed by” look around (?!). For instance, let’s find all emails where ‘a’ is not followed by ‘c’:</p>
<div class="sourceCode" id="cb230"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb230-1"><a href="tutorial-text-manipulation-with-stringr.html#cb230-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"a(?!c)"</span>) <span class="co"># matches all strings that contain an 'a' not followed by a 'c'</span></span></code></pre></div>
<pre><code>## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE</code></pre>
<p>Another useful look around is “preceded by” (?<=). This look around finds patterns that are preceded by a specific string or character. As an example, we can find all emails where ‘r’ is preceded by ‘e’:</p>
<div class="sourceCode" id="cb232"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb232-1"><a href="tutorial-text-manipulation-with-stringr.html#cb232-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"(?<=e)r"</span>) <span class="co"># matches all strings that contain an 'r' preceded by an 'e'</span></span></code></pre></div>
<pre><code>## [1] FALSE FALSE FALSE TRUE FALSE FALSE FALSE</code></pre>
<p>We can also negate this into “not preceded by” with the help of !:</p>
<div class="sourceCode" id="cb234"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb234-1"><a href="tutorial-text-manipulation-with-stringr.html#cb234-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"(?<!e)r"</span>) <span class="co"># matches all strings that contain an 'r' that is not preceded by an 'e'</span></span></code></pre></div>
<pre><code>## [1] TRUE FALSE TRUE FALSE FALSE FALSE FALSE</code></pre>
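<p>Look arounds are particularly useful for extraction, because the look around itself is checked but not included in the match. A short sketch using <code>str_extract()</code> on a shortened, illustrative vector: everything before the ‘@’ via “followed by”, and everything after it via “preceded by”.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(stringr)

some_emails <- c("mark@example.com", "jane@example.co.uk")

# .+ = one or more of any character; (?=@) requires an '@' right after the match
users <- str_extract(some_emails, ".+(?=@)")
users
# "mark" "jane"

# (?<=@) requires an '@' right before the match
domains <- str_extract(some_emails, "(?<=@).+")
domains
# "example.com" "example.co.uk"</code></pre></div>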
<p>Next, let’s discuss “groups”. Parentheses group elements of a pattern, so that an operator such as <code>|</code> (“or”) applies only to the part inside the group:</p>
<div class="sourceCode" id="cb236"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb236-1"><a href="tutorial-text-manipulation-with-stringr.html#cb236-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"(e|a)r"</span>) <span class="co"># matches all strings that contain an 'e' or an 'a' followed by an 'r'</span></span></code></pre></div>
<pre><code>## [1] TRUE FALSE FALSE TRUE FALSE FALSE FALSE</code></pre>
<p>Remember the “alternates” regex we discussed above? We can also use it here:</p>
<div class="sourceCode" id="cb238"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb238-1"><a href="tutorial-text-manipulation-with-stringr.html#cb238-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"[ea]r"</span>) <span class="co"># matches all strings that contain an 'e' or an 'a' which are followed by an 'r'</span></span></code></pre></div>
<pre><code>## [1] TRUE FALSE FALSE TRUE FALSE FALSE FALSE</code></pre>
<p>However, groups come in handy if you need to chain multiple groups for complex patterns. This is more advanced, but it’s worth knowing about.</p>
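<p>To give an idea of what chaining groups can look like, here is a hedged sketch using <code>str_match()</code>, which returns the full match plus one column per group. The pattern assumes that each address contains exactly one ‘@’; the vector and object names are illustrative.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(stringr)

some_emails <- c("mark@example.com", "jane@example.co.uk")

# group 1 = everything before the '@', group 2 = everything after it
m <- str_match(some_emails, "^(.+)@(.+)$")

m[, 1] # full match
# "mark@example.com" "jane@example.co.uk"
m[, 2] # first group: user part
# "mark" "jane"
m[, 3] # second group: domain part
# "example.com" "example.co.uk"</code></pre></div>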
<p>This was a lot, so here comes an overview / summary of all the regex (screenshot of the <code>stringr</code> cheat sheet):</p>
<table>
<colgroup>
<col width="100%" />
</colgroup>
<tbody>
<tr class="odd">
<td align="left"><em>Screenshot of anchors, alternates, quantifiers, look arounds, and groups (stringr cheat sheet):</em></td>
</tr>
<tr class="even">
<td align="left"><img src="images/stringr2.JPG" /></td>
</tr>
</tbody>
</table>
</div>
<div id="match-specific-types-of-characters" class="section level3 hasAnchor" number="6.4.3">
<h3><span class="header-section-number">6.4.3</span> Match specific types of characters<a href="tutorial-text-manipulation-with-stringr.html#match-specific-types-of-characters" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>There is one last thing I would like to teach you. It’s how to match specific types of characters (e.g., numbers, space symbols, etc.):</p>
<div class="sourceCode" id="cb240"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb240-1"><a href="tutorial-text-manipulation-with-stringr.html#cb240-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"[:digit:]"</span>) <span class="co"># matches all strings that contain a number</span></span></code></pre></div>
<pre><code>## [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE</code></pre>
<p>We can also detect email addresses that mistakenly contain a space symbol:</p>
<div class="sourceCode" id="cb242"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb242-1"><a href="tutorial-text-manipulation-with-stringr.html#cb242-1" tabindex="-1"></a><span class="fu">str_detect</span>(emails, <span class="st">"[:space:]"</span>) <span class="co"># matches all strings that contain a space</span></span></code></pre></div>
<pre><code>## [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE</code></pre>
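<p>Since our email addresses contain neither digits nor spaces, both checks return <code>FALSE</code> throughout. The sketch below uses a made-up vector with one digit and one space to show a positive result. It relies on the shorthand classes <code>\\d</code> (digit) and <code>\\s</code> (whitespace), which cover roughly the same characters as the two classes used above.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(stringr)

# hypothetical addresses: one contains a digit, one contains a space
messy <- c("tim2@example.edu", "tim @example.edu", "tim@example.edu")

has_digit <- str_detect(messy, "\\d") # \\d = any digit
has_space <- str_detect(messy, "\\s") # \\s = any whitespace character

has_digit
# TRUE FALSE FALSE
has_space
# FALSE TRUE FALSE</code></pre></div>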
<table>
<colgroup>
<col width="100%" />
</colgroup>
<tbody>
<tr class="odd">
<td align="left"><em>Screenshot of types of characters that can be matched in stringr (stringr cheat sheet):</em></td>
</tr>
<tr class="even">
<td align="left"><img src="images/stringr1.JPG" /></td>
</tr>
</tbody>
</table>
<p>Finished! Repeat and practice these commands and you will become a real expert on regular expressions! This tutorial covered the most common building blocks of regular expressions, with only a few aspects left to explore. If you feel like you need to look up some regular expressions, you can find them in this awesome <a href="https://github.com/rstudio/cheatsheets/blob/main/strings.pdf">stringr cheat sheet</a>.</p>
</div>
</div>
<div id="take-aways-5" class="section level2 hasAnchor" number="6.5">
<h2><span class="header-section-number">6.5</span> Take-Aways<a href="tutorial-text-manipulation-with-stringr.html#take-aways-5" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<ul>
<li><strong>String patterns &amp; RegEx</strong>: String patterns are sequences of
characters; regular expressions are abstract, generalized string patterns used
to match or detect other string patterns in texts.</li>
<li><strong>Important regular expressions</strong>: Anchors, alternates, quantifiers, lookarounds, and groups.
The best overview of all regex options can be found in the <a href="https://github.com/rstudio/cheatsheets/blob/main/strings.pdf">stringr cheat sheet</a>.</li>
</ul>
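<p>As a compact recap, the building blocks listed above could be sketched as follows. The vector <code>demo</code> is a made-up example and the patterns are just one way to illustrate each concept:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(stringr)

# Hypothetical addresses, made up for this recap
demo &lt;- c("anna@uni.de", "ben@mail.com", "carl@web.org")

str_detect(demo, "^a")                   # anchor: string starts with "a"
str_detect(demo, "\\.(de|org)$")         # group + alternates + anchor: ends in ".de" or ".org"
str_detect(demo, "n{2}")                 # quantifier: two "n" in a row
str_extract(demo, "(?&lt;=@)[:alpha:]+")  # lookaround: the letters directly after "@"</code></pre></div>
<p>The first and third calls return <code>TRUE FALSE FALSE</code>, the second returns <code>TRUE FALSE TRUE</code>, and the last one returns <code>"uni" "mail" "web"</code>.</p>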
</div>
<div id="additional-tutorials-5" class="section level2 hasAnchor" number="6.6">
<h2><span class="header-section-number">6.6</span> Additional tutorials<a href="tutorial-text-manipulation-with-stringr.html#additional-tutorials-5" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<p>Still have questions? The following tutorials, books, and tools may
help you:</p>
<ul>
<li><a href="https://github.com/rstudio/cheatsheets/blob/main/strings.pdf">Stringr Cheat
Sheet</a></li>
<li><a href="https://regexr.com/">Regexr - Learn, build, and test RegEx</a></li>
<li><a href="https://bookdown.org/valerie_hase/TextasData_HS2021/tutorial-9-searching-manipulating-string-patterns.html">Text as Data by V. Hase, Tutorial
9</a></li>
</ul>
</div>
</div>
</section>
</div>
</div>
</div>
<a href="exercise-4-test-your-knowledge.html" class="navigation navigation-prev " aria-label="Previous page"><i class="fa fa-angle-left"></i></a>
<a href="exercise-5-test-your-knowledge.html" class="navigation navigation-next " aria-label="Next page"><i class="fa fa-angle-right"></i></a>
</div>
</div>
<script src="libs/gitbook-2.6.7/js/app.min.js"></script>
<script src="libs/gitbook-2.6.7/js/clipboard.min.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-search.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-sharing.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-fontsettings.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-bookdown.js"></script>
<script src="libs/gitbook-2.6.7/js/jquery.highlight.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-clipboard.js"></script>
<script>
gitbook.require(["gitbook"], function(gitbook) {
gitbook.start({
"sharing": {
"github": false,
"facebook": true,
"twitter": true,
"linkedin": false,
"weibo": false,
"instapaper": false,
"vk": false,
"whatsapp": false,
"all": ["facebook", "twitter", "linkedin", "weibo", "instapaper"]
},
"fontsettings": {
"theme": "white",
"family": "sans",
"size": 2
},
"edit": {
"link": "https://github.com/LKobilke/boardgame/edit/main/index.Rmd",
"text": "Edit"
},
"history": {
"link": null,
"text": null
},
"view": {
"link": null,
"text": null
},
"download": ["Automated Analysis.pdf"],
"search": {
"engine": "fuse",
"options": null
},
"toc": {
"collapse": "subsection"
}
});
});
</script>
</body>
</html>