papers_transfer_type.csv
337 lines (337 loc) · 177 KB
ID\Name\URL\TransferExperimentType\TransferExperimentSubType\TransferDataType\TransferPerformanceMetrics\SuccessWithmetric\RL Implementation\RL Policy Type\Inter-Task Mappings\Autonomous Transfer?\Is Deep RL?\Special Novelty?\Application Field\Allowed Learners\Paper available?\Behind Paywall?\PDF available somewhere without paywall?\Paper Origin Country\Uni Name\Uni department\Paper for a Thesis? (0 = No, 1 = Is Thesis, 2 = Master, else ID)\Source Task Selection\Was in Survey (0 = No, 1 = yes and also here, 2 = yes but not in dataset)\Has transfer in Title\Has Transfer in Abstract\Has Transfer in Text\Additional Tags\Transfer-From-To\1 Rejected, 2 Unpublished\1 = part of phd, 2 = same but different length, 3 = duplicate, 4 = prepublish before conference acceptance, 5 = partially reused results, 6 = only some text overlap but new results, 7 = updated version, 8 = extended with partial figure reuse\2D Nav Level
2962858248\Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning\Some("https://ui.adsabs.harvard.edu/abs/2015arXiv151106342P/abstract")\["p", "r", "s", "#", "a"]\["lit"]\["weights"]\["j", "tt"]\\["Custom", "CS", "Theorem", "Lemma", "Formulas", "Figures", "Tables"]\["DQN", "AMN"]\["N/A"]\1\1\\["Simulation", "Games", "VideoGames", "Atari", "Generalization"]\["TD"]\1\0\0\["Canada"]\["University of Toronto"]\["Computer Science"]\0\["all"]\["mag", "zhu", "multi"]\["t", "r", "l", "ta"]\["t", "r", "l", "k", "de", "do", "g", "ta"]\["t", "r", "l", "k", "de", "do", "ta"]\\[["Atari", "Atari"]]\\\
2440926996\Successor Features for Transfer in Reinforcement Learning\Some("https://export.arxiv.org/pdf/1606.05312")\["s", "s_i", "s_f", "t"]\["multi"]\["fea", "pi"]\["j", "tt"]\\["Custom", "CS", "Pseudo", "Theorem", "Formulas", "Figures", "Tables", "Lemma"]\["DQN", "SFDQN", "SFQL", "PRQL"]\["N/A"]\0\1\\["2D", "Navigation", "Collect", "Simulation", "Robotics", "Mujoco", "Arm"]\["TD", "H"]\1\0\0\["USA"]\["DeepMind"]\["Industry"]\0\["all"]\["mag", "zhu"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\\
2605368761\Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning\Some("https://openreview.net/pdf?id=Hyq4yhile")\["#", "t", "v", "a"]\["diff-no"]\["fea"]\["j", "tt", "ap"]\\["Custom", "CS", "Mujoco", "Figures", "Formulas", "Tables", "Videos"]\["PG"]\["exp"]\0\1\\["Robotics", "Simulation", "Mujoco", "Button", "Pull", "Push", "MultiAgent"]\["TD"]\1\0\0\["USA"]\["University of California", "OpenAI"]\["Electrical Engineering and Computer Science", "Industry", "Uni", "Colab"]\0\["all"]\["mag", "silva", "zhu"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\["t", "r", "l", "k", "s"]\\\\\
2962985403\Universal Successor Representations for Transfer Reinforcement Learning\Some("https://openreview.net/pdf?id=HJ_CpYyDz")\["s_i", "s_f", "levels"]\["multi"]\["r", "fea"]\["ap", "tr", "tt", "j"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["USRA", "AC"]\["exp"]\0\1\\["2D", "Navigation", "Grid", "Simulation", "Successor"]\["TD"]\1\0\0\["USA", "Canada"]\["University of Alberta", "University of Montreal"]\["Computer Science"]\0\["all"]\["mag", "zhu"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2963611966\DARLA: improving zero-shot transfer in reinforcement learning\Some("https://www.arxiv-vanity.com/papers/1707.08475/")\["t", "s_i", "s_f", "levels"]\["diff-no", "sim2real"]\["fea", "curriculum"]\["ap", "tr", "tt", "j"]\\["Custom", "CS", "Formulas", "Figures", "Tables", "Videos", "TensorFlow", "Mujoco", "DeepMind Lab"]\["DQN", "A3C", "EC", "UNREAL"]\["exp"]\1\1\\["Simulation", "Representation", "JacoArm", "DeepMind Lab", "Robotics"]\["TD"]\1\0\0\["UK"]\["DeepMind"]\["Industry"]\0\["all"]\["mag", "zhu"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l", "k"]\\\\\
158722652\Transfer in Reinforcement Learning: a Framework and a Survey\Some("https://doi.org/10.1007/978-3-642-27645-3_5")\[]\["theory", "survey"]\[]\[]\\[]\[]\[]\0\0\\["theory", "survey"]\[]\1\0\0\["France"]\["University of Lille"]\["INRIA Lille"]\0\[]\["mag", "silva", "cur"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
1492014007\Building portable options: skill transfer in reinforcement learning\Some("http://ijcai.org/Proceedings/07/Papers/144.pdf")\["p"]\["diff-no"]\["options"]\["j", "tr"]\\["Custom", "CS", "Figures", "Formulas"]\["SARSA"]\["N/A"]\0\0\\["2D", "Navigation", "Collect", "Simulation"]\["TD", "H"]\1\0\0\["USA"]\["University of Massachusetts"]\["Computer Science"]\0\["h"]\["taylor", "mag", "lazaric", "bone"]\["t", "r", "l"]\["t", "l"]\["t", "r", "l"]\\[]\\\["Grid"]
2164114810\Cross-domain transfer for reinforcement learning\Some("https://www.researchgate.net/profile/Matthew_Taylor12/publication/221345060_Cross-domain_transfer_for_reinforcement_learning/links/0fcfd5114478d8d544000000.pdf")\["a", "r", "v"]\["diff-it"]\["rule", "advisor"]\["j", "tt", "tr"]\\["Custom", "CS", "Figures", "Formulas", "Pseudo", "Tables"]\["Q"]\["sup"]\0\0\\["Simulation", "RoboCup", "2D", "Navigation", "Grid", "KnightJoust", "Ringworld", "MultiAgent", "KeepAway", "3v2"]\["TD", "any"]\1\0\0\["USA"]\["The University of Texas"]\["Computer Science"]\0\["h"]\["taylor", "mag", "bone"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\[["Ringworld", "KeepAway3v2"], ["KnightJoust", "KeepAway3v2"]]\\\
2079247031\Autonomous shaping: knowledge transfer in reinforcement learning\Some("http://portal.acm.org/citation.cfm?doid=1143844.1143906")\["p"]\["diff-no"]\["r"]\["j", "tr"]\\["Custom", "CS", "Figures", "Formulas"]\["SARSA"]\["N/A"]\0\0\\["2D", "Navigation", "Simulation", "Homing"]\["TD"]\1\0\0\["USA"]\["University of Massachusetts"]\["Computer Science"]\0\["h"]\["taylor", "mag", "lazaric", "taylor-intertask", "bone", "silva", "zhu"]\["t", "r", "l", "k"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[]\\\["Obstacles"]
2795717084\StarCraft Micromanagement with Reinforcement Learning and Curriculum Transfer Learning\Some("http://export.arxiv.org/abs/1804.00810")\["s_i", "s_f", "levels", "t"]\["multi"]\["curriculum"]\["j", "tr", "tt", "ap"]\\["Custom", "OSS", "bwapi", "Pseudo", "Figures", "Formulas", "Tables"]\["PS-MAGDS", "SARSA"]\["N/A"]\0\1\\["Simulation", "Games", "VideoGames", "StarCraft", "Micromanagement", "Curriculum", "MultiAgent"]\["TD", "H", "MB"]\1\0\0\["China"]\["Chinese Academy of Sciences", "State Key Laboratory of Management and Control for Complex Systems"]\["Institute of Automation"]\0\["all"]\["mag", "cur"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\\
2126565096\Behavior transfer for value-function-based reinforcement learning\Some("http://dl.acm.org/citation.cfm?doid=1082473.1082482")\["v", "#"]\["diff-it"]\["Q"]\["j", "tt"]\\["Custom", "CS", "RoboCup", "Formulas", "Videos", "Figures", "Tables"]\["SARSA", "CMAC", "SMDP"]\["sup"]\0\0\\["2D", "Simulation", "RoboCup", "KeepAway", "MultiAgent", "4v3", "3v2"]\["TD"]\1\0\0\["USA"]\["The University of Texas"]\["Computer Sciences"]\0\["h"]\["mag", "lazaric", "cur"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[["KeepAway3v2", "KeepAway4v3"]]\\\
2950819666\Zero-shot Deep Reinforcement Learning Driving Policy Transfer for Autonomous Vehicles based on Robust Control\Some("http://export.arxiv.org/abs/1812.03216")\["t", "s_i", "s_f", "s", "levels"]\["diff-no"]\["fea", "pi"]\["ap", "tr", "j", "tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables", "Videos"]\["RL-RC"]\["N/A"]\0\1\\["Robotics", "Autonomous", "Driving", "Vehicle", "Simulation"]\["H", "TD"]\1\0\0\["USA"]\["University of California", "Berkeley DeepDrive"]\["Mechanical Engineering"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2795341696\Reinforcement learning for non-prehensile manipulation: Transfer from simulation to physical system\Some("https://au.arxiv.org/pdf/1803.10371")\["t"]\["sim2real"]\["pi"]\["j"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Videos", "Tables"]\["NPG", "PG"]\["N/A"]\0\1\\["Robotics", "3D", "Arm", "Simulation", "RealWorld", "Push"]\["TD"]\1\0\0\["USA"]\["University of Washington", "Roboti LLC"]\["Industry", "Uni", "Colab"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2114580749\An Introduction to Intertask Transfer for Reinforcement Learning\Some("http://www.aaai.org/ojs/index.php/aimagazine/article/view/2329")\[]\["theory", "survey"]\[]\[]\\[]\[]\[]\2\2\\["theory", "survey"]\[]\1\0\0\["USA"]\["Lafayette College", "The University of Texas"]\["Computer Science"]\0\[]\["mag", "multi"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "s", "k"]\\\\\
2153353285\Using advice to transfer knowledge acquired in one reinforcement learning task to another\Some("https://doi.org/10.1007/11564096_40")\["p"]\["diff-it"]\["advisor"]\["ap", "tr"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["SARSA"]\["sup"]\0\0\\["2D", "RoboCup", "KeepAway", "BreakAway", "MultiAgent", "Simulation", "3v2", "2v1"]\["TD"]\1\0\0\["USA"]\["University of Wisconsin", "University of Minnesota"]\[]\0\["h"]\["taylor", "lazaric", "zhu", "mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\[["BreakAway", "KeepAway"]]\\\
2124695578\Transfer learning for reinforcement learning on a physical robot\Some("https://www.researchgate.net/profile/Matthew_Taylor12/publication/228959234_Transfer_learning_for_reinforcement_learning_on_a_physical_robot/links/0fcfd51144740519c6000000.pdf?origin=publication_detail")\["t"]\["sim2real", "multi"]\["Q"]\["j", "tr", "ap"]\\["Custom", "CS", "Formulas", "Figures", "Web", "Dead"]\["Q", "SARSA"]\["sup"]\0\0\\["3D", "Robotics", "KeepAway", "RoboCup", "Simulation", "RealWorld", "MultiAgent"]\["TD"]\1\0\0\["USA"]\["The University of Texas", "University of Southern California"]\["Computer Science"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2004030284\Transfer of samples in batch reinforcement learning\Some("https://core.ac.uk/display/108628857")\["s_i", "s_f", "s"]\["multi"]\["I"]\["j", "tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["FQI"]\["N/A"]\0\0\\["2D", "Navigation", "Obstacle", "Samples", "Simulation"]\["TD", "batch"]\1\1\0\["Italy"]\["Politecnico di Milano"]\["IIT-Lab"]\0\["lib"]\["lazaric", "zhu", "bone", "mag", "cur"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\["Areas", "Obstacles"]
2110292307\Transferring instances for model-based reinforcement learning\Some("https://rd.springer.com/chapter/10.1007/978-3-540-87481-2_32")\["a", "v"]\["diff-it", "lit"]\["I"]\["ap", "tr", "j"]\\["Custom", "CS", "Pseudo", "Web", "Dead", "Figures", "Formulas", "Tables"]\["SARSA", "R-Max", "TIMBREL"]\["sup", "exp"]\0\0\\["2D", "3D", "2Dto3D", "MountainCar", "MountainCar3D", "Simulation", "ClassicControl"]\["MB", "all"]\1\0\0\["USA"]\["The University of Texas"]\["Computer Science"]\0\["h"]\["taylor", "lazaric", "taylor-intertask", "bone", "mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[["MountainCar", "MountainCar3D"]]\\\
2154328025\Transfer via inter-task mappings in policy search reinforcement learning\Some("http://www.cs.utexas.edu/users/ai-lab/?taylor:ijcaams07")\["a", "v"]\["diff-it", "lit"]\["pi"]\["tt", "j", "tr"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["TVITM-PS"]\["sup"]\0\0\\["Simulation", "RoboCup", "KeepAway", "MultiAgent", "3v2", "4v3"]\["PS", "all"]\1\0\0\["USA"]\["The University of Texas"]\["Computer Science"]\0\["h"]\["taylor", "silva", "bone", "taylor-intertask", "lazaric", "mag", "cur"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[["KeepAway3v2", "KeepAway4v3"]]\\\
2158150115\Autonomous transfer for reinforcement learning\Some("https://dl.acm.org/citation.cfm?id=1402427")\["a", "v", "t"]\["lit", "diff-no"]\["I", "Q"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables", "Web", "Dead"]\["Q"]\["exp"]\1\0\\["2D", "3D", "2Dto3D", "MountainCar", "MountainCar3D", "Simulation", "Handbrake", "ClassicControl"]\["TD"]\1\0\0\["USA"]\["The University of Texas"]\["Computer Science"]\0\["h"]\["lazaric", "bone", "cur", "mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\[["MountainCar", "MountainCar3D"]]\\\
2913485808\Universal Successor Features for Transfer Reinforcement Learning\Some("http://arxiv.org/pdf/2001.04025.pdf")\["s_i", "s_f", "levels", "r"]\["multi"]\["fea"]\["tt"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Gym"]\["DQN", "DDPG", "USF"]\["N/A"]\0\1\\["2D", "Navigation", "Grid", "Reacher", "FetchReach", "Simulation"]\["TD"]\1\0\0\["USA", "Canada"]\["University of Alberta", "University of Montreal"]\["Computer Science"]\0\["all"]\["zhu", "mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\2\\
2784831250\Cross-Domain Transfer in Reinforcement Learning using Target Apprentice\Some("http://export.arxiv.org/pdf/1801.06920")\["v", "a", "t"]\["diff-no", "lit"]\["I", "advisor"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Theorem"]\["FQI", "Q", "UMA", "TA"]\["Ma", "exp"]\1\0\\["2D", "Simulation", "CartPole", "Bicycle", "Navigation", "Grid", "ClassicControl"]\["TD", "batch"]\1\0\0\["USA"]\["University of Illinois"]\["Department of Agriculture and Biological Engineering and Coordinated Science Lab"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\[["CartPole", "Bicycle"]]\\\
2949876402\Grounding Language for Transfer in Deep Reinforcement Learning\Some("http://export.arxiv.org/abs/1708.00133")\["t", "s"]\["multi"]\["pi"]\["ap", "tr"]\\["OSS", "Custom", "Pseudo", "Formulas", "Figures", "Tables"]\["DQN", "AMN", "VIN"]\["sup", "text"]\0\1\\["GVGAI", "Simulation", "Games", "MonteCarlo"]\["TD"]\1\0\0\["USA"]\["Princeton University", "Massachusetts Institute of Technology"]\["Computer Science", "Artificial Intelligence Laboratory"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\["Grid", "Obstacles"]
2132057084\Transfer in variable-reward hierarchical reinforcement learning\Some("https://link.springer.com/content/pdf/10.1007/s10994-008-5061-y.pdf")\["r"]\["multi"]\["pi"]\["tr"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Theorem"]\["MAXQ", "VRHRL"]\["N/A"]\0\0\\["RTS", "2D", "Grid", "Simulation"]\["H", "MB"]\1\0\0\["USA"]\["Oregon State"]\["Electrical Engineering and Computer Science"]\0\["lib"]\["taylor", "lazaric", "zhu", "mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\[]\\\
36691172\Using Homomorphisms to transfer options across continuous reinforcement learning domains\Some("https://dl.acm.org/citation.cfm?id=1597538.1597618")\["a", "v"]\["lit"]\["N/A"]\["ap", "j", "tr"]\\["Custom", "CS", "Pseudo", "Figures", "Formulas"]\["Q", "IntraOptionQ", "CMAC"]\["Ma", "svg", "exp"]\0\0\\["2D", "Navigation", "Blocksworld", "KeepAway", "RoboCup", "Simulation", "MultiAgent", "3v2", "4v3"]\["all", "H"]\1\0\0\["USA"]\["University of Michigan"]\["Computer Science and Engineering"]\0\["h"]\["taylor", "taylor-intertask", "lazaric", "cur", "mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\[["Blocksworld3", "Blocksworld7"], ["Blocksworld4", "Blocksworld7"], ["Blocksworld5", "Blocksworld7"], ["Blocksworld6", "Blocksworld7"], ["KeepAway3v2", "KeepAway4v3"]]\\\["Blocksworld"]
2925000324\Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus\Some("https://ui.adsabs.harvard.edu/abs/2019arXiv190310671G/abstract")\[]\["pure"]\[]\[]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\[]\[]\0\1\\["NLP", "Simulation", "Text", "StyleTransfer"]\[]\1\0\0\["USA"]\["University of Illinois", "T. J. Watson Research Center"]\["Industry", "Uni", "Colab"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
1848094219\Unsupervised cross-domain transfer in policy gradient reinforcement learning via manifold alignment\Some("https://dl.acm.org/citation.cfm?id=2886521.2886669")\["t", "a", "v"]\["diff-no", "lit"]\["I", "pi", "manifold"]\["j", "tt", "ap", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo"]\["PG", "MAXDT-PG"]\["exp"]\1\0\\["2D", "CartPole", "SimpleMass", "CartPoleThreeLink", "Simulation", "ClassicControl"]\["TD"]\1\0\0\["USA"]\["University of Pennsylvania", "Olin College of Engineering", "Washington State University"]\[]\0\["h"]\["mag", "zhu"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[["SimpleMass", "CartPole"], ["CartPole", "ThreeLinkCartPole"], ["CartPole", "Quadrotor"]]\\\
110451278\Reinforcement learning transfer via sparse coding\Some("https://cris.maastrichtuniversity.nl/portal/en/publications/reinforcement-learning-transfer-via-sparse-coding(55f157dc-7588-4af7-9d75-bf299bd3e19e).html")\["t", "a", "v"]\["diff-no", "lit"]\["Q", "I"]\["tt", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo"]\["LSPI", "FQI"]\["exp"]\0\0\\["2D", "Simulation", "MountainCar", "InvertedPendulum", "CartPole", "ClassicControl"]\["TD", "batch"]\1\0\0\["Netherlands", "USA"]\["Maastricht University", "Lafayette College"]\[]\126751897\["h"]\["zhu", "mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[["InvertedPendulum", "MountainCar"], ["MountainCar", "CartPole"]]\\1\
2098723043\Value-function-based transfer for reinforcement learning using structure mapping\Some("https://www.aaai.org/Library/AAAI/2006/aaai06-066.php")\["a", "v"]\["lit"]\["N/A"]\[]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["QDBN"]\["exp"]\0\0\\["2D", "RoboCup", "BreakAway", "Simulation", "Games", "MultiAgent", "Qualitative Dynamic Bayes Network", "3v2", "4v3"]\["all"]\1\0\0\["USA"]\["The University of Texas"]\["Computer Science"]\0\["h"]\["taylor", "taylor-intertask", "mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\[]\\\
2281071090\Unsupervised energy prediction in a Smart Grid context using reinforcement cross-building transfer learning\Some("http://www.sciencedirect.com/science/article/pii/S0378778816300305")\["t"]\["same_all"]\["Q"]\["ap", "tr"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["SARSA", "DBN", "Q"]\["N/A"]\0\1\\["Deep Belief Network", "Power", "Efficiency", "SmartGrid", "RBM", "Simulation"]\["TD"]\1\0\0\["Netherlands"]\["Technical University of Eindhoven"]\["Electrical Engineering"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[]\\\
2963983495\Theoretically-grounded policy advice from multiple teachers in reinforcement learning settings with applications to negative transfer\Some("https://dl.acm.org/citation.cfm?id=3060945")\[]\["same_all"]\["advisor"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Theorem", "Lemma"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Simulation", "Grid", "CombinationLock", "Maze", "BlockDude"]\["TD"]\1\0\0\["USA"]\["Washington State University", "Princeton University"]\[]\0\[]\["mag", "silva"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2235081654\Autonomous cross-domain knowledge transfer in lifelong policy gradient reinforcement learning\Some("http://www.seas.upenn.edu/~eeaton/papers/BouAmmar2015Autonomous.pdf")\["a", "p", "r", "t", "v"]\["lit", "multi"]\["sub", "fea", "pi"]\["j"]\\["Custom", "CS", "Formulas", "Theorem", "Figures"]\["Cross-Domain", "PG-Ella", "PG"]\["exp"]\1\0\\["Cross-Domain", "Lifelong", "PolicyGradient", "SimpleMass", "DoubleMass", "CartPole", "DoubleCartPole", "Bicycle", "Helicopter", "Simulation", "ClassicControl"]\["TD"]\1\0\0\["USA"]\["University of Pennsylvania", "Olin College of Engineering"]\["Computer Science"]\0\["all"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\[["Many", "Many"]]\\\
2341634245\Neural signature of hierarchically structured expectations predicts clustering and transfer of rule sets in reinforcement learning.\Some("https://philpapers.org/rec/COLNSO")\[]\[]\[]\[]\\[]\[]\[]\2\2\\["Psychology"]\[]\1\0\0\["USA"]\["Brown University", "University of California Berkeley"]\["Cognitive, Linguistic and Psychological Science", "Psychology"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2141559023\An automated measure of MDP similarity for transfer in reinforcement learning\Some("http://www.seas.upenn.edu/~eeaton/papers/BouAmmar2014Automated.pdf")\["t"]\["multi"]\["Q"]\["j"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["SARSA", "Q", "FQI", "LSPI"]\["N/A"]\0\1\\["2D", "Simulation", "CartPole", "InvertedPendulum", "MountainCar", "RBDist", "ClassicControl"]\["TD", "batch"]\1\0\0\["USA", "Netherlands", "UK"]\["University of Pennsylvania", "Washington State University", "Technical University of Eindhoven", "University of Maastricht", "University of Liverpool"]\[]\0\["lib", "h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2604618034\Knowledge Transfer for Deep Reinforcement Learning with Hierarchical Experience Replay.\Some("https://dr.ntu.edu.sg/handle/10220/42453")\["t", "s_i", "s_f", "levels", "s"]\["multi"]\["fea", "pi_dyn", "distil"]\["ap"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["DQN", "AMN", "DIST"]\["N/A"]\0\1\\["2D", "Simulation", "Games", "VideoGames", "Atari"]\["TD", "H"]\1\0\0\["Singapore"]\["Nanyang Technological University"]\["Computer Science and Engineering"]\0\["lib"]\["mag", "zhu"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2972430483\Transfer of Temporal Logic Formulas in Reinforcement Learning.\Some("https://arxiv.org/abs/1909.04256")\["s_i", "s_f", "levels"]\["multi"]\["Q"]\["tt", "tr", "ap"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Timed", "Simulation"]\["TD"]\1\0\0\["USA"]\["The University of Texas"]\["Computational Engineering and Sciences", "Aerospace Engineering and Engineering Mechanics"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2586101872\An Evolutionary Transfer Reinforcement Learning Framework for Multiagent Systems\Some("https://doi.org/10.1109/TEVC.2017.2664665")\["s_i", "s_f", "levels"]\["multi"]\["Q", "advisor"]\["ap"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["Q", "FALCON", "BP"]\["N/A"]\0\1\\["2D", "3D", "Simulation", "Games", "Navigation", "Minefield", "MinefieldNavigation", "UnrealTournament", "MultiAgent"]\["TD"]\1\1\1\["Singapore", "China", "USA"]\["Nanyang Technological University", "Chongqing University", "University of Louisville"]\["Computer Science and Engineering", "Electrical and Computer Engineering"]\2778259398\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\3\
2807652507\Context-Aware Indoor VLC/RF Heterogeneous Network Selection: Reinforcement Learning With Knowledge Transfer\Some("https://dblp.uni-trier.de/db/journals/access/access6.html#DuWSW18")\["t"]\["same_all"]\["Q"]\["tt", "tr"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["Q"]\["exp"]\0\0\\["Indoor Network Selection", "Network", "Contextaware", "Simulation", "MonteCarlo"]\["TD"]\1\0\0\["China"]\["National University of Defense Technology", "National Digital Switching System Engineering and Technological Research Center"]\["Information and Communication"]\0\["lib"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\[]\\\
2259258048\Multiagent Reinforcement Learning With Sparse Interactions by Negotiation and Knowledge Transfer\Some("https://www.ncbi.nlm.nih.gov/pubmed/27046917")\["s", "levels"]\["multi"]\["Q"]\["tt", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Tables"]\["Q", "CQ", "NegoQV", "ILVFT", "NegoSI"]\["N/A"]\0\0\\["2D", "Simulation", "Navigation", "Grid", "Warehouse", "MultiAgent"]\["TD"]\1\0\0\["China", "USA"]\["Nanjing University", "University of Michigan"]\["Control and Systems Engineering", "State Key Laboratory for Novel Software Technology"]\0\["h"]\["silva", "mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
1533597678\Transfer Learning in Reinforcement Learning Problems Through Partial Policy Recycling\Some("https://doi.org/10.1007/978-3-540-74958-5_70")\["#"]\["diff-no"]\["Q"]\["ap", "j", "tt", "tr"]\\["Custom", "CS", "Figures"]\["TgR"]\["N/A"]\0\0\\["Bongard", "Blocksworld", "TicTacToe", "2D", "Navigation", "BoardGames", "Simulation", "Games"]\["RRL"]\1\0\0\["Netherlands"]\["K.U.Leuven"]\["Computer Science"]\0\[]\["taylor", "bone", "mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\[]\\\["Blocksworld"]
2156493855\Relational macros for transfer in reinforcement learning\Some("https://experts.umn.edu/en/publications/relational-macros-for-transfer-in-reinforcement-learning")\["a", "r", "v"]\["diff-it"]\["relational-macros"]\["j", "tr"]\\["Custom", "CS", "Pseudo", "Figures", "Tables"]\["Q", "RMT-D"]\["sup"]\0\0\\["2D", "RoboCup", "BreakAway", "Simulation", "Games", "MultiAgent", "2v1", "3v2", "4v3"]\["TD", "RRL"]\1\0\0\["USA"]\["University of Minnesota", "University of Wisconsin"]\["Computer Science"]\0\["h"]\["taylor", "taylor-intertask", "bone", "mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\[]\\\
2953052971\Source-Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language\Some("https://export.arxiv.org/pdf/1808.06167")\[]\["pure"]\[]\[]\\["Custom", "CS", "Formulas", "Figures"]\["REINFORCE"]\[]\0\1\\["Translation", "Simulation", "NLP"]\["TD", "PS"]\1\0\0\["China", "USA"]\["National Laboratory of Pattern Recognition", "University of Chinese Academy of Sciences", "Mobvoi AI Lab"]\["Institute of Automation", "Industry"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2766253973\Heuristically Accelerated Reinforcement Learning by Means of Case-Based Reasoning and Transfer Learning\Some("https://researchnow.flinders.edu.au/en/publications/heuristically-accelerated-reinforcement-learning-by-means-of-case")\["t"]\["sim2real"]\["pi", "Q", "cases"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo"]\["Q", "HAQL", "TLHAQL"]\["N/A"]\0\0\\["Robotics", "Simulation", "RealWorld", "RoboCup", "Kilobot", "Acrobot", "MultiAgent", "ClassicControl"]\["CBR", "TD"]\1\0\0\["Brazil", "Spain"]\["Technological Institute of Aeronautics", "Spanish National Research Council", "Centro Universitário da FEI"]\["Artificial Intelligence Research Institute"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\[["RoboCup", "Kilobot"], ["Acrobot", "Humanoid"]]\\\
2960705509\Assessing Transferability from Simulation to Reality for Reinforcement Learning\Some("https://export.arxiv.org/pdf/1907.04685")\["s_i", "s_f", "t"]\["sim2real", "multi"]\["pi"]\["ap"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Tables", "Videos"]\["SPOTA", "EPOpt", "PPO"]\["N/A"]\0\1\\["3D", "Robotics", "BallBalancer", "CartPole", "SimulationOptimizationBias", "Simulation", "RealWorld", "ClassicControl"]\["TD", "PS"]\1\0\0\["Germany"]\["Technische Universität Darmstadt", "Honda Research Institute Europe", "Max Planck Institute"]\["Intelligent Autonomous Systems Group", "Industry"]\0\["h"]\["mag"]\["t", "r", "l"]\["t"]\["t", "r", "l"]\\\\\
2785494456\Meta-Learning Transferable Active Learning Policies by Deep Reinforcement Learning\Some("https://export.arxiv.org/pdf/1806.04798")\[]\["pure"]\["DNN"]\[]\\["Custom", "CS", "Figures", "Formulas", "Tables"]\["SingleRL", "MLP-GAL", "REINFORCE"]\["N/A"]\0\1\\["Meta", "Simulation", "Datasets", "UCI"]\["TD", "PS"]\1\0\0\["UK", "Japan"]\["University of Edinburgh", "University College London", "Nara Institute of Technology"]\["Institute of Perception, Action and Behaviour", "Statistics"]\0\[]\["mag"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l"]\\\2\\
2534140487\Using Transfer Learning to Speed-Up Reinforcement Learning: A Cased-Based Approach\Some("https://www.iiia.csic.es/es/publications/using-transfer-learning-speed-reinforcement-learning-case-based-approach")\["v", "t"]\["diff-it"]\["cases", "pi", "Q"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Tables"]\["Q", "HAQL", "TL-HAQL"]\["sup"]\0\1\\["2D", "3D", "2Dto3D", "MountainCar", "Simulation", "ClassicControl"]\["CBR", "TD"]\1\0\0\["Brazil", "Spain"]\["Technological Institute of Aeronautics", "Spanish National Research Council", "Centro Universitário da FEI"]\["Artificial Intelligence Research Institute"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[["MountainCar", "MountainCar3D"]]\\\
2997770298\Correction: Guidance for Closed-Loop Transfers using Reinforcement Learning with Application to Libration Point Orbits\Some("http://dx.doi.org/10.2514/6.2020-0458.c1")\[]\["pure"]\[]\[]\\["Custom", "CS", "PPO-Pat-Coady", "Formulas", "Figures", "TensorFlow", "Tables"]\["PPO"]\[]\0\1\\["Simulation", "Space", "ClosedLoop", "SpaceCraft"]\["TD"]\1\0\0\["USA"]\["Purdue University", "Massachusetts Institute of Technology"]\[]\0\[]\["mag"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l", "k"]\\\\\
2036103676\Accelerating Multiagent Reinforcement Learning by Equilibrium Transfer\Some("http://or.nsfc.gov.cn/handle/00001903-5/480512")\["r", "s", "#", "v"]\["multi", "multiagent"]\["equilibrium"]\["j", "tr", "tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Theorem", "Lemma", "Figures", "Tables"]\["NashQ", "CEQ"]\[]\0\0\\["2D", "Grid", "Navigation", "Simulation", "SoccerGame", "WallGame", "MultiGrid", "Equilibrium", "MultiAgent"]\["TD"]\1\0\0\["China", "Singapore"]\["Nanjing University", "Nanyang Technological University"]\["Computer Science", "Computer Engineering"]\0\[]\["mag", "silva"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[]\\\["Grid", "WallGame", "SoccerGame"]
2907543530\Deep Reinforcement Learning with Knowledge Transfer for Online Rides Order Dispatching\Some("https://dblp.uni-trier.de/db/conf/icdm/icdm2018.html#WangQTYZ18")\["t", "s_i", "s_f", "levels"]\["multi"]\["fea"]\["j", "tr", "tt", "ap"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo"]\["DQN"]\["N/A"]\0\1\\["Simulation", "RideSharing", "Dispatch", "MultiAgent"]\["TD"]\1\1\1\["USA", "China"]\["Washington State University", "DiDi Research America", "DiDi Chuxing"]\["Industry", "Uni", "Colab"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2014512216\Transfer of Experience Between Reinforcement Learning Environments with Progressive Difficulty\Some("https://www.researchgate.net/profile/Michael_Madden3/publication/220637641_Transfer_of_Experience_Between_Reinforcement_Learning_Environments_with_Progressive_Difficulty/links/00b7d5249d3c37fa43000000.pdf?disableCoverPage=true")\["s_i", "s_f", "t", "levels"]\["multi"]\["rule"]\["tr", "tt"]\\["Custom", "CS", "Figures", "Formulas", "Tables"]\["Q", "P-RL"]\["N/A"]\0\0\\["2D", "Grid", "Navigation", "Simulation"]\["TD"]\1\0\0\["Ireland"]\["National University of Ireland"]\["Information Technology"]\0\["all"]\["taylor", "silva", "bone", "lazaric", "mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\[]\\\["Grid", "Maze"]
2742143911\Advantages and Limitations of using Successor Features for Transfer in Reinforcement Learning\Some("http://export.arxiv.org/pdf/1708.00102")\["s_i", "s_f", "r"]\["multi"]\["fea", "successor"]\["j", "tt", "ap"]\\["Custom", "CS", "Pseudo", "Figures", "Formulas", "Tables", "TensorFlow"]\["FQI", "FSF"]\["N/A"]\0\1\\["Simulation", "2D", "Navigation", "Grid", "Successor"]\["TD", "batch"]\1\0\0\["USA"]\["Brown University"]\[]\0\["all"]\["zhu", "mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\2\\
2963049105\Online Transfer Learning in Reinforcement Learning Domains.\Some("http://irll.eecs.wsu.edu/wp-content/papercite-data/pdf/2015sdmia-zhan.pdf")\["s"]\["same_all"]\["advisor"]\["tr"]\\["CS", "Custom", "Theorem", "Figures", "Tables", "Formulas", "Lemma"]\["Q", "SARSA"]\["N/A"]\0\0\\["Simulation", "Games", "Pacman", "Atari"]\["TD"]\1\0\0\["USA"]\["Washington State University"]\["Electrical Engineering and Computer Science"]\0\["h"]\["zhu", "mag"]\["t", "r", "l"]\["t", "l"]\["t", "r", "l", "k"]\\\\\
2985293964\Transfer in Deep Reinforcement Learning Using Knowledge Graphs.\Some("https://dblp.uni-trier.de/db/journals/corr/corr1908.html#abs-1908-06556")\["s", "levels"]\["multi", "lit"]\["graph", "pi"]\["tt"]\\["CS", "Custom", "KG-DQN", "PyTorch", "Jericho", "TextWorld", "Formulas", "Figures", "Tables"]\["DQN", "KG-DQN"]\["N/A"]\0\1\\["Simulation", "TextBased", "Game"]\["TD"]\1\0\0\["USA"]\["Georgia Institute of Technology"]\["Interactive Computing"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2771643225\Tuning the molecular weight distribution from atom transfer radical polymerization using deep reinforcement learning\Some("http://export.arxiv.org/abs/1712.04516")\[]\["pure"]\[]\[]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["A3C"]\["N/A"]\0\1\\["Chemistry", "Simulation", "CNN"]\["TD"]\1\0\0\["USA"]\["Carnegie Mellon University"]\["Chemistry", "Computer Science"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2949472713\Transfer and Online Reinforcement Learning in STT-MRAM Based Embedded Systems for Autonomous Drones\Some("https://arxiv.org/abs/1905.06314")\["s_i", "s_f", "levels"]\["multi"]\["fea", "pi"]\["ap"]\\["Custom", "CS", "Pseudo", "Figures", "Formulas", "TensorFlow", "AirSim", "Videos", "LinkNotIncluded"]\["DDQN", "DNN"]\["N/A"]\0\1\\["RealTime", "Simulation", "Drone", "ObjectAvoidance", "CNN", "STT-MRAM", "Robotics", "Vehicle", "Aerial"]\["TD"]\1\0\0\["USA"]\["Georgia Institute of Technology", "Samsung Semiconductor"]\["Advanced Logic Lab"]\2966501669\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\2\
2145729962\Empirical Evidence of Priming Transfer Reinforcement and Learning in the Real and Virtual Trillium Trails\Some("https://doi.org/10.1109/TLT.2010.20")\[]\[]\[]\[]\\[]\[]\[]\2\2\\["Psychology"]\[]\1\0\0\["USA"]\["University of Central Florida"]\["Psychology"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\\
2782656435\Sample-Efficient Reinforcement Learning through Transfer and Architectural Priors.\Some("http://export.arxiv.org/pdf/1801.02268")\["s_i", "s_f", "levels"]\["multi"]\["pri", "fea"]\["tr"]\\["Custom", "CS", "TensorFlow", "Keras", "Gym", "Figures", "Formulas"]\["DQN", "DDQN"]\["N/A"]\0\1\\["2D", "Simulation", "Navigation", "Grid", "Pickup"]\["TD"]\1\0\0\["USA"]\["Cornell University"]\["SE(3) Group"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\2\\
2953431737\End-to-end nonprehensile rearrangement with deep reinforcement learning and simulation-to-reality transfer\Some("https://dblp.uni-trier.de/db/journals/ras/ras119.html#YuanHKWS19")\["t", "s_i", "s_f", "#"]\["sim2real", "multi"]\["pi", "fea"]\["ap"]\\["Custom", "CS", "Pseudo", "Figures", "Formulas", "Tables"]\["DQN", "Q"]\["N/A"]\0\1\\["Simulation", "RealWorld", "Robotics", "Arm", "Objects", "Obstacle"]\["TD"]\1\1\1\["China", "USA", "Sweden"]\["Hong Kong University of Science and Technology", "Yale University", "KTH Royal Institute of Technology", "Örebro University"]\["Robotics Institute", "Mechanical Engineering and Material Science", "Centre for Autonomous System", "Centre for Applied Autonomous Sensor Systems"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\\
2138497321\Representation Transfer for Reinforcement Learning\Some("http://www.cs.utexas.edu/users/ai-lab/?AAAI07-Symposium-Taylor")\["a", "v"]\["diff-it"]\["Q"]\["tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["SARSA"]\["sup"]\0\0\\["Simulation", "RoboCup", "KeepAway", "MultiAgent", "3v2", "4v3"]\["TD", "PS"]\1\0\0\["USA"]\["The University of Texas"]\["Computer Science"]\0\["h"]\["taylor", "mag", "lazaric", "zhu", "bone"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\[["KeepAway3v2", "KeepAway4v3"]]\\\
2289068127\Don't think just feel the music: Individuals with strong pavlovian-to-instrumental transfer effects rely less on model-based reinforcement learning\Some("http://pesquisa.bvsalud.org/portal/resource/pt/mdl-26942321")\[]\["pure"]\[]\[]\\[]\[]\[]\2\2\\["Psychology"]\[]\1\0\0\["USA"]\[]\["Psychology"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2963913081\Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation\Some("http://proceedings.mlr.press/v97/gamrian19a.html")\["s", "t", "s_i", "s_f", "levels"]\["diff-no"]\["GAN", "imitation"]\["ap", "tt"]\\["OSS", "Custom", "PyTorch", "Gym", "Pseudo", "Formulas", "Figures", "Tables", "Videos"]\["A2C", "A3C"]\["N/A"]\0\1\\["2D", "Games", "VideoGames", "Atari", "Breakout", "Roadfighter", "Simulation"]\["TD"]\1\0\0\["Israel"]\["Bar-Ilan University", "Allen Institute for Artificial Intelligence"]\["Computer Science", "Industry"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
1457482454\Transfer learning in multi-agent reinforcement learning domains\Some("https://doi.org/10.1007/978-3-642-29946-9_25")\["#", "s_i", "s_f"]\["multi"]\["Q", "pi"]\["j", "ap"]\\["Custom", "CS", "Pseudo", "Figures", "Formulas", "Tables"]\["Q", "BITER"]\["Ma", "exp"]\0\0\\["2D", "Navigation", "Grid", "Simulation", "PredatorPrey", "MultiAgent"]\["TD"]\1\0\0\["Greece"]\["Aristotle University"]\["Informatics"]\0\["h"]\["silva", "mag", "multi"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\["PredatorPrey"]
2739691807\Deep Transfer in Reinforcement Learning by Language Grounding.\Some("http://export.arxiv.org/abs/1708.00133")\["t", "s"]\["multi"]\["pi"]\["ap", "tr"]\\["OSS", "Custom", "Pseudo", "Formulas", "Figures", "Tables"]\["DQN", "AMN", "VIN"]\["sup", "text"]\0\1\\["GVGAI", "Simulation", "Games", "CNN", "NLP", "MonteCarlo"]\["TD", "MB"]\1\0\0\["USA"]\["Princeton University", "Massachusetts Institute of Technology"]\["Computer Science", "Artificial Intelligence Laboratory"]\2949876402\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\4\
3001528895\Single Episode Policy Transfer in Reinforcement Learning.\Some("http://arxiv.org/pdf/1910.07719.pdf")\["t"]\["multi"]\["VAE", "latentspace"]\["tt", "tr"]\\["OSS", "Custom", "Pseudo", "Formulas", "Figures"]\["DDQN"]\["N/A"]\0\1\\["Simulation", "2D", "Navigation", "Acrobot", "HIV", "ClassicControl"]\["TD", "MB"]\1\0\0\["USA"]\["Georgia Institute of Technology", "Lawrence Livermore National Laboratory"]\[]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\\
2887671224\Policy and Value Transfer in Lifelong Reinforcement Learning\Some("http://proceedings.mlr.press/v80/abel18b.html")\["v", "s_i", "s_f", "levels"]\["multi"]\["Q"]\["j", "tr", "tt", "ap"]\\["OSS", "Custom", "Pseudo", "Formulas", "Figures", "Theorem", "Tables"]\["Q", "Delayed-Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Simulation"]\["TD", "MB"]\1\0\0\["USA"]\["Brown University"]\["Computer Science"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "l"]\["t", "r", "l", "k", "s"]\\\\\
1490954610\Learning relational options for inductive transfer in relational reinforcement learning\Some("https://rd.springer.com/chapter/10.1007/978-3-540-78469-2_12")\["#"]\["diff-no"]\["options"]\["ap", "j", "tr"]\\["Custom", "CS", "Formulas", "Figures"]\["Q", "TILDE"]\["N/A"]\0\0\\["2D", "Navigation", "TicTacToe", "BoardGames", "Blocksworld", "Grid", "Simulation", "Games"]\["RRL", "H"]\1\1\1\["Belgium"]\["K.U.Leuven"]\["Computer Science"]\0\["h"]\["taylor", "mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[]\\\["Blocksworld"]
2979341666\Autonomous Navigation via Deep Reinforcement Learning for Resource Constraint Edge Nodes using Transfer Learning\Some("https://arxiv.org/abs/1910.05547")\["t", "s_i", "s_f", "levels"]\["sim2real", "multi"]\["pi", "offline"]\["j", "tr", "ap", "tt"]\\["OSS", "TensorFlow", "AirSim", "Figures", "Tables", "Formulas", "Pseudo", "Videos"]\["DQN"]\["N/A"]\0\1\\["3D", "Navigation", "Simulation", "Drone", "RealWorld", "CNN", "Aerial"]\["TD"]\1\0\0\["USA"]\["Georgia Institute of Technology"]\["Electrical and Computer Engineering"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\\
2921955147\A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems\Some("https://doi.org/10.1613/jair.1.11396")\[]\["theory", "survey"]\[]\[]\\[]\[]\[]\2\1\\["theory", "survey", "multiagent"]\[]\1\0\0\["Brazil"]\["University of Sao Paulo"]\["Computer Engineering"]\0\[]\["mag", "zhu", "multi"]\["t", "r", "l"]\["r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2181867278\Automatically Mapped Transfer between Reinforcement Learning Tasks via Three-Way Restricted Boltzmann Machines\Some("https://research.tue.nl/en/publications/automatically-mapped-transfer-between-reinforcement-learning-task-2")\["a", "t", "r"]\["lit"]\["pi"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures"]\["RBM"]\["exp"]\1\1\\["Simulation", "MountainCar", "CartPole", "RBM", "ClassicControl"]\["TD"]\1\0\0\["Netherlands", "USA"]\["Maastricht University", "Washington State University"]\["Knowledge Engineering", "Electrical Engineering and Computer Science"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[["Inverted Pendulum", "CartPole"], ["MountainCar", "CartPole"]]\\\
2313922182\Towards transferring skills to flexible surgical robots with programming by demonstration and reinforcement learning\Some("https://ieeexplore.ieee.org/document/7449855/")\["t"]\["same_all", "sim2real"]\["pi"]\[]\\["Custom", "CS", "Figures", "Formulas"]\["Q", "PoWER"]\["N/A"]\0\0\\["Simulation", "Robotics", "Arm", "Surgery"]\["TD"]\1\1\1\["China", "Singapore"]\["University of Hong Kong", "National University of Singapore"]\["Industrial Manufacturing Systems Engineering", "Biomedical Engineering"]\0\[]\["mag"]\["t", "r", "l", "s"]\["r", "l"]\["r", "l"]\\\\\
2944895663\TibGM: A Transferable and Information-Based Graphical Model Approach for Reinforcement Learning\Some("http://eprints.gla.ac.uk/212298/")\["t", "a"]\["same_all"]\["model", "latent"]\["ap", "tr"]\\["Custom", "CS", "Formulas", "Gym", "rllab", "SAC", "Figures", "Tables", "github/haarnoja/sac"]\["TibGM", "DDPG", "LSP", "SAC", "PPO", "ERL", "DIAYN", "VIREL", "GEP-PG", "ProMP"]\["N/A"]\0\1\\["Simulation", "Robotics", "Mujoco", "Swimmer", "Hopper", "Walker2d", "HalfCheetah", "Ant", "Humanoid", "MountainCar", "ClassicControl"]\["TD", "H"]\1\0\0\["UK"]\["University of Cambridge", "Alan Turing Institute"]\["Engineering"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2735260544\Automatic Discovery and Transfer of Task Hierarchies in Reinforcement Learning\Some("https://dblp.uni-trier.de/db/journals/aim/aim32.html#MehtaRTD11")\["v", "a"]\["diff-no"]\["model", "hierarchy", "sub"]\["j", "tt"]\\["Custom", "CS", "Pseudo", "Figures"]\["Q", "HI-MAT"]\["N/A"]\0\0\\["Simulation", "2D", "Grid", "Navigation", "Resources", "Mining", "Wargus", "Taxi", "MultiAgent"]\["TD", "H", "B"]\1\0\0\["USA"]\["Oregon State University", "Case Western Reserve University"]\[]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2517639096\Graph based skill acquisition and transfer Learning for continuous reinforcement learning domains\Some("http://www.sciencedirect.com/science/article/pii/S0167865516302112")\["s_i", "s_f", "s", "levels"]\["same_all"]\["Q", "graph"]\["tt", "ap", "tr"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["GSL", "SARSA"]\["N/A"]\0\0\\["Simulation", "2D", "Pinball", "GraphBased"]\["TD", "H"]\1\1\1\["Iran"]\["University of Tehran"]\["Electrical and Computer Engineering"]\0\["all"]\["mag"]\["t", "r", "l", "s"]\["t", "r", "l", "s", "k"]\["t", "r", "l", "s", "k"]\\\\\
2966501669\Hierarchical Memory System With STT-MRAM and SRAM to Support Transfer and Real-Time Reinforcement Learning in Autonomous Drones\Some("https://ui.adsabs.harvard.edu/abs/2019IJEST...9..485Y/abstract")\["s_i", "s_f", "levels"]\["multi"]\["fea", "pi"]\["ap"]\\["Custom", "CS", "Pseudo", "Figures", "TensorFlow", "AirSim", "Tables"]\["DDQN", "DNN"]\["N/A"]\0\1\\["RealTime", "Simulation", "Drone", "ObjectAvoidance", "CNN", "STT-MRAM", "Robotics", "Vehicle", "Aerial", "RealWorld"]\["TD"]\1\1\1\["USA"]\["Georgia Institute of Technology", "IBM T.J. Watson Research Center", "Samsung Semiconductor"]\["Industry", "Uni", "Colab"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2619275098\Personalizing a Dialogue System with Transfer Reinforcement Learning\Some("https://www.arxiv.org/abs/1610.02891")\["t", "#"]\["same_all"]\["Q"]\["ap", "tr"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["Q"]\["lib"]\0\0\\["Simulation", "Dialogue", "Personalized", "PETAL", "CoffeeShopping", "NLP"]\["TD"]\1\0\0\["China"]\["Hong Kong University"]\["Science and Technology"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\2\\
193076044\Using cases as heuristics in reinforcement learning: a transfer learning application\Some("http://digital.csic.es/bitstream/10261/60868/1/IJCAI%202011%20%281211-1217%29.pdf")\["v", "t", "s", "r"]\["diff-it"]\["cases", "pi", "Q"]\["ap"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Tables"]\["Q", "HAQL", "CB-HAQL", "L3"]\["sup", "exp"]\0\0\\["2D", "3D", "2Dto3D", "Robotics", "Simulation", "Acrobot", "RoboCup3D", "ClassicControl"]\["CBR", "TD"]\1\0\0\["Brazil", "Spain"]\["Technological Institute of Aeronautics", "Universitat Autonoma de Barcelona"]\["Electrical Engineering", "Artificial Intelligence Research Institute"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[["Acrobot", "RoboCup3D"]]\\\
2925306934\How to pick the domain randomization parameters for sim-to-real transfer of reinforcement learning policies?\Some("http://ui.adsabs.harvard.edu/abs/2019arXiv190311774V/abstract")\["t"]\["same_all"]\["pi"]\[]\\["Formulas", "OSS", "Docker", "Gym", "Mujoco", "PPO", "Custom", "PyTorch"]\["PPO"]\["N/A"]\0\1\\["2D", "Robotics", "Simulation", "Dart", "Mujoco", "Hopper", "Walker"]\["TD"]\1\0\0\["USA"]\["University of California"]\["Computer Science and Engineering"]\0\[]\["mag"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l", "k"]\\\2\\
2897005774\Deep Transfer Reinforcement Learning for Text Summarization\Some("https://arxiv.org/abs/1810.06667")\["s", "r", "datasets"]\["multi"]\["pi", "pi_dyn"]\["ap"]\\["Custom", "CS", "Figures", "Formulas", "ROUGE", "Tables"]\["PG", "SelfCriticPolicyGradient"]\["N/A"]\0\1\\["Text", "Summarization", "ROUGE", "NLP", "Simulation", "CNN"]\["TD"]\1\0\0\["USA"]\["Virginia Tech"]\["Discovery Analytics Center"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2605369401\Transfer Reinforcement Learning with Shared Dynamics\Some("https://hal.archives-ouvertes.fr/hal-01548649")\["t", "s_i", "s_f", "levels"]\["multi"]\["r"]\["j", "tt", "ap", "tr"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Simulation"]\["TD"]\1\0\0\["France", "Canada"]\["Orange Labs at Châtillon", "Maluuba at Montréal"]\["Industry"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2998384510\Exploration of Long Time-of-Flight Three-Body Transfers Using Deep Reinforcement Learning\Some("http://dx.doi.org/10.2514/6.2020-0460")\[]\["pure"]\[]\[]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["DDPG"]\[]\2\1\\["Robotics", "Fuel", "Transfer", "Simulation"]\["TD"]\1\1\1\["Japan"]\["The University of Tokyo", "Japan Aerospace Exploration Agency"]\["Industry", "Uni", "Colab"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
3010212570\Exploiting Multi-Task Learning to Achieve Effective Transfer Deep Reinforcement Learning in Elastic Optical Networks\Some("https://www.osapublishing.org/abstract.cfm?uri=OFC-2020-M1B.3")\["#"]\["multi"]\["pi", "fea"]\["j", "tt", "tr", "ap"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["A3C"]\["N/A"]\0\1\\["Elastic", "Optical", "Network", "Simulation"]\["TD"]\1\0\0\["USA", "China"]\["University of California", "University of Science and Technology of China"]\[]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2575472443\Transfer learning for multiagent reinforcement learning systems\Some("https://dl.acm.org/citation.cfm?id=3061181")\[]\["theory"]\[]\[]\\[]\[]\[]\2\2\\["Theory", "MultiAgent"]\[]\1\0\0\["Brazil"]\["Universidade de São Paulo"]\["Escola Politécnica"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
3010403426\A Reinforcement Learning Based Network Scheduler for Deadline-Driven Data Transfers\Some("https://dblp.uni-trier.de/db/conf/globecom/globecom2019.html#GhosalSSTW19")\[]\["pure"]\[]\[]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Q"]\[]\2\0\\["Network", "DataTransfer", "Deadline", "Scheduling", "Simulation"]\["TD"]\1\0\0\["USA"]\["University of California", "Lawrence Berkeley National Laboratory"]\["Computer Science"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2403765497\Directed Exploration in Reinforcement Learning with Transferred Knowledge\Some("http://proceedings.mlr.press/v24/mann12a.html")\["s", "s_i", "s_f", "a"]\["multi"]\["Q"]\["j", "tt"]\\["Custom", "CS", "Theorem", "Formulas", "Lemma", "Figures"]\["DelayedQ"]\["Ma"]\0\0\\["Robotics", "Kinematic", "Arm", "Simulation"]\["TD"]\1\0\0\["USA"]\["Texas A&M University"]\["Computer Science and Engineering"]\0\[]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[]\\\
2049423956\Transfer learning with Partially Constrained Models: Application to reinforcement learning of linked multicomponent robot system control\Some("https://dblp.uni-trier.de/db/journals/ras/ras61.html#Fernandez-GaunaLG13")\["v", "t", "r"]\["multi"]\["Q"]\["tt", "ap"]\\["Custom", "CS", "Formulas", "Figures", "Videos"]\["Q"]\["Ma"]\0\0\\["Robotics", "Simulation", "Hose", "PCM", "2D", "Navigation"]\["TD"]\1\1\1\["Spain"]\["University of the Basque Country"]\["Computational Intelligence Group"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2533806771\A Reinforcement Learning Architecture That Transfers Knowledge Between Skills When Solving Multiple Tasks\Some("http://ieeexplore.ieee.org/document/7592409/")\["t", "s", "s_i", "s_f"]\["multi"]\["advisor"]\["tt"]\\["iCub", "Custom", "CS", "Formulas", "Figures", "Tables"]\["TERL"]\["exp"]\1\1\\["Simulation", "2D", "Robotics", "Arm", "3D", "iCub"]\["TD", "H"]\1\0\0\["Singapore", "Italy"]\["Nanyang Technological University", "Laboratory of Computational Embodied Neuroscience"]\["Robotics Research Centre"]\0\["all"]\["mag"]\["t", "r", "l", "k", "s"]\["t", "r", "l", "k", "s"]\["t", "r", "l", "k", "s"]\\\\\
2951871955\Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement\Some("http://export.arxiv.org/pdf/1901.10964")\["s"]\["multi"]\["fea"]\["j", "tt", "tr", "ap"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Tables", "Videos"]\["Q", "SF+GPI-Q", "DQ"]\["N/A"]\0\1\\["3D", "Navigation", "Simulation", "CNN", "Successor", "Collect"]\["TD"]\1\0\0\["USA"]\["DeepMind"]\["Industry"]\0\["lib"]\["mag", "zhu"]\["t", "r", "l"]\["t", "r", "l", "s"]\["t", "r", "l", "s", "k"]\\\\\
1582256513\Task similarity measures for transfer in reinforcement learning task libraries\Some("http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1555955")\["s_i", "s_f", "levels"]\["multi"]\["fea"]\["ap"]\\["Custom", "CS", "Figures", "Formulas", "Tables"]\["Q"]\["N/A"]\0\0\\["Simulation", "2D", "Navigation", "Grid", "MovingGoal"]\["TD"]\1\0\0\["USA"]\["Brigham Young University"]\["Computer Science"]\0\["all"]\["taylor", "mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\["Grid", "Obstacles"]
2986946894\An Online Search Method for Representative Risky Fault Chains Based on Reinforcement Learning and Knowledge Transfer\Some("https://ieeexplore.ieee.org/document/8890733")\["t"]\["multi"]\["Q"]\["j", "tt", "tr", "ap"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["Q"]\["N/A"]\0\0\\["Simulation", "RealWorld", "Power", "SmartEnergy", "Outage", "Prevention"]\["TD"]\1\1\1\["China", "USA"]\["Tsinghua University", "Argonne National Laboratory", "University of Tennessee"]\["Electrical Engineering", "Electrical Engineering and Computer Sciences"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2989824507\Deep Reinforcement Learning for Transfer of Control Policies\Some("https://asmedigitalcollection.asme.org/IDETC-CIE/proceedings/IDETC-CIE2019/59186/V02AT03A003/1069764")\[]\[]\[]\[]\\["Custom", "CS"]\[]\[]\2\2\\["Rotor", "Control", "Aircraft", "Simulation", "Aerial"]\[]\0\1\0\["USA"]\["Penn State University"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\[]\\[]\\\
2906096117\Transfer Learning Method based on Spreading Activation Model for Reinforcement Learning\Some("https://www.jstage.jst.go.jp/article/jsmermd/2018/0/2018_1A1-C14/_pdf")\[]\[]\[]\[]\\["Custom", "CS"]\[]\[]\2\2\\["Mobile robot", "Cognitive psychology", "Psychology"]\[]\0\1\0\["Japan"]\["Japan Society of Mechanical Engineers"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\[]\\[]\\\
2803825057\What to Transfer in Lifelong Reinforcement Learning\Some("https://icml.cc/Conferences/2018/AcceptedPapersInitial#402")\[]\[]\[]\[]\\[]\[]\[]\2\2\\[]\[]\0\0\0\[]\[]\[]\2887671224\[]\["mag"]\["t", "r", "l"]\[]\[]\\\\7\
2626333357\Energy saving in heterogeneous cellular network via transfer reinforcement learning based policy\Some("https://ieeexplore.ieee.org/document/7945411/")\["s_i"]\["same_all"]\["pi"]\["ap", "tr"]\\["Custom", "CS", "Formulas", "Figures"]\["Actor-Critic"]\["N/A"]\0\0\\["Energy Saving", "Network", "WiFi", "RealWorld", "Simulation"]\["TD"]\1\1\1\["India"]\["Indraprastha Institute of Information Technology"]\["Electronics and Communication Engineering"]\2604038904\["h"]\["mag"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l"]\\[]\\1\
2607014226\Learning to Predict Consequences as a Method of Knowledge Transfer in Reinforcement Learning\Some("https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7902152")\["s_i", "s_f", "levels"]\["multi"]\["fea", "Q", "predictions"]\["ap", "tr", "tt"]\\["CreateSim", "Custom", "CS", "Pseudo", "Formulas", "Figures"]\["Q"]\["N/A"]\0\0\\["2D", "Simulation", "Navigation", "Maze", "Grid"]\["TD", "MB"]\1\0\0\["Canada"]\["University of Lethbridge"]\["Department of Neuroscience"]\0\["h", "lib"]\["mag", "silva"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2979766322\Federated Transfer Reinforcement Learning for Autonomous Driving.\Some("http://arxiv.org/pdf/1910.06001.pdf")\["t"]\["sim2real"]\["pi", "federated"]\["ap", "tr"]\\["Custom", "CS", "fakeOSS", "Stage", "AirSim", "JetsonTX2", "Figures", "Formulas", "Tables"]\["DDPG"]\["N/A"]\0\1\\["Simulation", "RealWorld", "RCCar", "Driving", "Autonomous", "Indoor", "Collision", "Avoidance", "MultiAgent"]\["TD"]\1\0\0\["China"]\["Hong Kong University of Science and Technology", "WeBank"]\["Industry", "Uni", "Colab", "Robotics and MultiPerception Laborotary", "Department of Artificial Intelligence"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\2\\
2965033324\A Dual Reinforcement Learning Framework for Unsupervised Text Style Transfer\Some("https://doi.org/10.24963/ijcai.2019/711")\[]\["pure"]\[]\[]\\["OSS", "Github", "TensorFlow", "OpenNMT", "Figures", "Formulas", "Pseudo", "Tables"]\["DualRL"]\[]\2\1\\["StyleTransfer", "Text", "NLP", "Simulation"]\["TD"]\1\0\0\["China"]\["WeChat AI", "Tencent"]\["Industry"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[]\\\
2027020365\Automated Transfer for Reinforcement Learning Tasks\Some("https://core.ac.uk/display/153339942")\[]\["theory"]\[]\[]\\[]\[]\[]\2\2\\["Theory"]\[]\1\0\0\["USA", "Netherlands", "UK"]\["University of Pennsylvania", "Maastricht University", "University of Liverpool"]\["Computer and Information Science", "Knowledge Engineering"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
1510402218\Using Options for Knowledge Transfer in Reinforcement Learning\Some("https://www.researchgate.net/profile/Doina_Precup/publication/2609073_Using_Options_for_Knowledge_Transfer_in_Reinforcement_Learning/links/0912f51191d83cea16000000.pdf")\["t"]\["multi"]\["options"]\["tt"]\\["Custom", "CS", "Formulas", "Figures"]\["Q"]\["N/A"]\0\0\\["MountainCar", "Simulation", "ClassicControl"]\["TD", "H"]\1\0\0\["USA"]\["University of Massachusetts"]\[]\0\["all"]\["taylor", "lazaric", "mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\[]\\\
2899403804\Design of Transfer Reinforcement Learning Under Low Task Similarity\Some("http://proceedings.asmedigitalcollection.asme.org/proceeding.aspx?articleid=2713207")\["s", "levels"]\["multi"]\["advisor"]\["tt"]\\["Custom", "CS", "Formulas", "Figures", "TensorFlow", "Tables"]\["DQN"]\["N/A"]\0\1\\["2D", "Simulation", "Navigation", "Collision", "Avoidance"]\["TD"]\1\0\0\["USA"]\["University of Southern California"]\["Aerospace & Mechanical Engineering"]\3010746488\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "s", "k"]\\\\5\
1523133203\Enhancing Transfer in Reinforcement Learning by Building Stochastic Models of Robot Actions\Some("https://dl.acm.org/citation.cfm?id=142042")\["r"]\["multi"]\[]\["j"]\\["Custom", "CS"]\["Q"]\[]\2\2\\["2D", "Grid", "Navigation", "Robotics", "Simulation"]\["TD"]\0\1\0\["USA"]\["IBM"]\["Industry"]\0\[]\["mag"]\["t", "r", "l"]\[]\[]\\\\\
2110064866\Transferring Expectations in Model-based Reinforcement Learning\Some("https://www.researchgate.net/profile/Tze-Yun_Leong/publication/264503330_Transferring_Expectations_in_Model-based_Reinforcement_Learning/links/53e1f85d0cf2d79877a9f9b4.pdf?origin=publication_detail")\["s_f", "t"]\["multi"]\["model"]\["tr", "tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["DBN"]\["N/A"]\0\1\\["Simulation", "Expectations", "Views", "2D", "Grid", "Navigation", "DynamicBayesianNetwork"]\["TD", "H", "B", "MB"]\1\0\0\["Singapore"]\["National University of Singapore"]\["School of Computing"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "s", "k"]\\\\\["Grid"]
1563109146\Shaping in reinforcement learning via knowledge transferred from human-demonstrations\Some("https://ieeexplore.ieee.org/document/7260106/")\["t", "v", "a"]\["multi"]\["r", "rule"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures"]\["Q"]\["N/A"]\0\0\\["2D", "3D", "2Dto3D", "MountainCar", "MountainCar3D", "Simulation", "ClassicControl"]\["TD"]\1\1\1\["China"]\["Zhejiang University"]\["Aeronautics and Astronautics"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\[["FlappyBird", "Maze"]]\\\["Maze", "Grid"]
2896329788\Missile aerodynamic design using reinforcement learning and transfer learning\Some("http://engine.scichina.com/publisher/scp/journal/SCIS/61/11/10.1007/s11432-018-9463-x?slug=fulltext")\["t"]\["same_all"]\["fea"]\["j", "tr", "tt"]\\["Custom", "CS", "Formulas", "Figures"]\["DDPG", "SL-DDPG"]\[]\0\1\\["LiftDragRatio", "Simulation", "DDPG", "Aerodynamic", "Design", "Military", "Weapons"]\["TD"]\1\1\1\["China"]\["Tsinghua University"]\["Computer Science and Technology"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[]\\\
2792514232\Flexible Robotic Grasping with Sim-to-Real Transfer based Reinforcement Learning.\Some("http://hdl.handle.net/20.500.11850/322242")\["t"]\["pure", "sim2real"]\["pi"]\["j"]\\["Custom", "CS", "Formulas", "rllab", "PyBullet", "Figures", "Tables"]\["TRPO"]\[]\0\1\\["3D", "CNN", "DepthCamera", "Robotics", "Grasping", "RealWorld", "Simulation"]\["TD"]\1\0\0\["Switzerland"]\["Eidgenössische Technische Hochschule Zürich"]\["Autonomous Systems Lab"]\0\["h"]\["mag"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l", "s", "k"]\\[]\\\
2039109166\Rule abstraction and transfer in reinforcement learning by decision tree\Some("http://www.robot.t.u-tokyo.ac.jp/~yamashita/paper/B/B085Final.pdf")\["s_i", "s_f", "levels"]\["multi"]\["rule"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Tables"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Simulation", "Obstacle", "CollisionAvoidance"]\["TD", "RRL"]\1\0\0\["Japan"]\["University of Tokyo"]\["Precision Engineering"]\0\["h"]\["mag"]\["t", "r", "l"]\["r", "l", "k"]\["t", "r", "l", "k"]\\\\\["Obstacles"]
2550776844\Role of dopamine in learning beyond reinforcement: Boost of auditory learning and transfer by orally taken or gameplay-generated dopamine\Some("https://www.win.ox.ac.uk/publications/925528")\[]\["psychology"]\[]\[]\\[]\[]\[]\2\2\\["Psychology", "Tetris"]\[]\1\1\1\["China"]\["Beijing Normal University"]\[]\0\[]\["mag"]\[]\[]\[]\\\\\
2995352224\Single episode transfer for differing environmental dynamics in reinforcement learning\Some("https://openreview.net/pdf?id=rJeQoCNYDS")\["t"]\["same_all"]\["pi", "VAE"]\["tt", "tr"]\\["OSS", "TensorFlow", "Pseudo", "Formulas", "Figures"]\["DDQN", "SEPT"]\["exp"]\0\1\\["2D", "Simulation", "Navigation", "Grid", "Acrobot", "HIV", "SEPT", "ClassicControl"]\["TD"]\1\0\0\["USA"]\["Georgia Institute of Technology", "Lawrence Livermore National Laboratory"]\["Computer Science"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "s", "k"]\\\\\
2515676083\Transferring knowledge from human-demonstration trajectories to reinforcement learning\Some("http://journals.sagepub.com/doi/pdf/10.1177/0142331216649655")\["t"]\["same_all"]\["r", "KNN"]\["j", "ap", "tr", "tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["LSPI", "Q"]\["N/A"]\0\0\\["HumanDemonstration", "KNN", "Frequency", "CartPole", "Simulation", "2D", "ClassicControl"]\["TD", "batch"]\1\1\1\["China"]\["Zhejiang University"]\["Aeronautics and Astronautics", "Control Science and Engineering"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
1981289969\Nash-reinforcement learning (N-RL) for developing coordination strategies in non-transferable utility games\Some("http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6974336")\[]\["pure"]\[]\[]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["NashRL"]\[]\2\0\\["Games", "NTU", "Nash", "Simulation", "Hydropower", "Power", "Electricity", "ElectricityMarket", "MultiAgent"]\["TD"]\1\0\0\["UK", "USA", "Australia"]\["Imperial College London", "University of Central Florida", "The University of Melbourne"]\["Environmental Policy", "Environmental and Construction Engineering"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2085926075\Errorless Learning: Reinforcement Contingencies and Stimulus Control Transfer in Delayed Prompting.\Some("https://onlinelibrary.wiley.com/doi/10.1901/jaba.1984.17-175/references")\[]\[]\[]\[]\\[]\[]\[]\2\2\\["Neurosciences", "Neurology"]\[]\1\0\0\["USA"]\["Harvard Medical School", "California State College"]\["Neurology"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
115717799\Transfer learning for reinforcement learning through goal and policy parametrization\Some("https://lirias.kuleuven.be/bitstream/123456789/131381/1/2006_wtl_driessens.pdf")\[]\["theory"]\[]\[]\\[]\[]\[]\2\2\\["Report", "State-of-the-Art"]\[]\1\0\0\["Belgium"]\["Katholieke Universiteit Leuven"]\["Computer Science"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "s", "k"]\\\\\
2803180393\Importance Weighted Transfer of Samples in Reinforcement Learning\Some("http://export.arxiv.org/abs/1805.10886")\["t", "s_f"]\["multi"]\["pi"]\["j", "tr", "tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Theorem", "Lemma", "Tables"]\["FQI", "RBT", "SDT", "IWFQI"]\["exp"]\1\0\\["Simulation", "Puddleworld", "WaterReservoirControl", "Acrobot", "ClassicControl"]\["TD", "MB", "batch"]\1\0\0\["Italy", "France"]\["Politecnico di Milano", "INRIA Lille"]\["Computer Science"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2910593956\Design of Transfer Reinforcement Learning Mechanisms for Autonomous Collision Avoidance\Some("https://link.springer.com/chapter/10.1007/978-3-030-05363-5_17")\["s", "levels", "#"]\["multi"]\["advisor"]\["tt"]\\["Custom", "CS", "Formulas", "Figures", "TensorFlow", "Tables"]\["DQN"]\["N/A"]\0\1\\["2D", "Simulation", "Navigation", "Collision", "Avoidance"]\["TD"]\1\0\0\["USA"]\["University of Southern California"]\["Aerospace & Mechanical Engineering"]\3010746488\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\5\
3014664339\DECAF: Deep Case-based Policy Inference for knowledge transfer in Reinforcement Learning\Some("https://www.sciencedirect.com/science/article/pii/S095741742030244X")\["s", "s_i", "s_f", "t"]\["multi", "lit"]\["PolicyLibrary"]\["tt"]\\["OSS", "TensorFlow", "Gym", "OpenCV", "Formulas", "Figures", "Pseudo"]\["Q", "DQN", "PLBI", "CBPI", "DECAF", "A2T"]\["N/A"]\0\1\\["Simulation", "2D", "Navigation", "Grid", "Games", "VideoGames", "Atari"]\["CBR", "TD"]\1\1\1\["Brazil"]\["University of Sao Paulo", "FEI’s University Centre"]\[]\0\["h", "lib"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2978426310\Strategy Selection in Complex Game Environments Based on Transfer Reinforcement Learning\Some("https://ieeexplore.ieee.org/document/8852019")\["a", "r", "t", "#"]\["lit"]\["advisor"]\["tt"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Tables"]\["A3C"]\["N/A"]\0\1\\["2D", "Games", "VideoGames", "Atari", "Simulation", "Imitation"]\["TD"]\1\1\1\["China", "Canada"]\["Dalian University of Technology", "McGill University"]\["Computer Science and Technology"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2810881727\Transfer with Model Features in Reinforcement Learning.\Some("https://arxiv.org/pdf/1807.01736")\["r", "t"]\["multi"]\["model", "fea"]\["tt"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Theorem", "Lemma"]\["Q", "Random"]\["exp"]\0\0\\["2D", "Navigation", "Grid", "Simulation"]\["TD"]\1\0\0\["USA"]\["Brown University"]\["Computer Science"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\2\\
929682\Speeding-up reinforcement learning through abstraction and transfer learning\Some("https://dl.acm.org/citation.cfm?id=2484942")\["s_i", "s_f"]\["multi"]\["pi"]\["tr", "ap"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures"]\["S2L-RL", "Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Simulation", "Robotics"]\["TD", "RRL"]\1\0\0\["Brazil"]\["Universidade de São Paulo"]\[]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\["Grid", "Rooms", "SingleLevel"]
2604402446\Accelerating Multiagent Reinforcement Learning through Transfer Learning.\Some("https://dblp.uni-trier.de/db/conf/aaai/aaai2017.html#SilvaC17")\[]\["theory"]\[]\[]\\[]\[]\[]\2\2\\["Theory", "MultiAgent"]\[]\1\0\0\["Brazil"]\["Universidade de São Paulo"]\["Escola Politécnica"]\0\[]\["silva", "mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2812068818\Optimizing data transfers for improved performance on shared GPUs using reinforcement learning\Some("https://experts.syr.edu/en/publications/optimizing-data-transfers-for-improved-performance-on-shared-gpus")\[]\[]\[]\[]\\[]\[]\[]\2\2\\["Performance", "Optimization", "GPGPU", "Simulation", "MonteCarlo", "unrelated"]\[]\1\1\1\["USA"]\["Air Force Research Laboratory", "Syracuse University"]\["Information"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\[]\\\
2018629084\Towards reinforcement learning representation transfer\Some("https://dl.acm.org/citation.cfm?id=1329125.1329248")\["a", "v"]\["diff-it"]\["Q"]\["tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Web", "Dead"]\["SARSA"]\["sup"]\0\0\\["Simulation", "RoboCup", "KeepAway", "MultiAgent", "3v2"]\["TD"]\1\0\0\["USA"]\["The University of Texas"]\["Computer Science"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2972017633\Transferring optimal contact skills to flexible manipulators by reinforcement learning\Some("https://dblp.uni-trier.de/db/journals/ijira/ijira3.html#XuPR19")\[]\["pure"]\[]\[]\\["Custom", "CS", "Formulas", "Figures", "Videos", "MIPS"]\["DDPG"]\[]\2\1\\["Robotics", "RealWorld", "Palpation", "Manipulators"]\["TD"]\1\1\1\["Singapore", "China"]\["National University of Singapore", "Tongji University"]\["Biomedical Engineering", "Control Science & Engineering"]\0\[]\["mag"]\["t", "r", "l", "s"]\["r", "l", "s"]\["t", "r", "l", "s", "k"]\\[]\\\
2240785302\Collaborative reinforcement learning for a two-robot job transfer flow-shop scheduling problem\Some("http://www.tandfonline.com/doi/full/10.1080/00207543.2015.1057297")\["t"]\["multi"]\["pi"]\["tr"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Q"]\["N/A"]\0\0\\["Simulation", "Robotics", "FlowShop", "Collaboration", "Scheduling"]\["TD"]\1\0\0\["Israel"]\["Ben-Gurion University of the Negev"]\["Industrial Engineering and Management"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2969408081\Sim-to-real transfer reinforcement learning for control of thermal effects of an atmospheric pressure plasma jet\Some("https://ui.adsabs.harvard.edu/abs/2019PSST...28i5019W/abstract")\["t"]\["pure", "sim2real"]\["pi"]\["tt"]\\["Custom", "CS", "Formulas", "Figures", "Tables", "Pseudo"]\["RLC", "G-RLC"]\[]\2\1\\["Robotics", "FeedbackControl", "AtmosphericPressurePlasma", "RealWorld"]\["TD"]\1\0\0\["USA", "Switzerland"]\["University of California", "Ecole Polytechnique Fédérale de Lausanne"]\["Chemical and Biomolecular Engineering", "Molecular Simulation"]\0\[]\["mag"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l"]\\[]\\\
2344013593\Relational transfer in reinforcement learning\Some("https://minds.wisconsin.edu/handle/1793/60678")\["#", "t", "a", "v"]\["multi"]\["advisor", "rule", "Q", "MLN"]\["j", "tt", "ap", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Tables", "Pseudo", "Web", "Dead"]\["Q", "MLN"]\["sup"]\0\0\\["Simulation", "RoboCup", "MultiAgent", "KeepAway", "BreakAway", "MoveDownfield", "2v1", "3v2", "4v3"]\["TD", "RRL", "B"]\1\0\0\["USA"]\["University of Wisconsin"]\["Computer Sciences"]\1\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2906148830\Relationship Between the Order for Motor Skill Transfer and Motion Complexity in Reinforcement Learning\Some("https://doi.org/10.1109/LRA.2018.2889026")\["s", "t"]\["sim2real", "curriculum", "multi"]\["pi"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Tables"]\["Q", "PoWER"]\["N/A"]\0\0\\["Robotics", "Simulation", "RealWorld", "MotorSkill", "Demonstration", "Expert", "Manipulation", "Drawing", "Fitting"]\["TD", "B", "MB"]\1\1\1\["South Korea"]\["Hanyang University", "Korea Institute of Industrial Technology"]\["Department of Electronics and Computer Engineering", "Smart Research Group"]\0\["h"]\["mag"]\["t", "r", "l", "s"]\["t", "r", "l", "s"]\["t", "r", "l", "s", "k"]\\\\\
2148179544\Structural knowledge transfer by spatial abstraction for reinforcement learning agents\Some("https://dblp.uni-trier.de/db/journals/adb/adb18.html#FrommbergerW10")\["t", "s_i", "s_f"]\["pure", "sim2real"]\["fea", "Q"]\["j", "tt", "tr", "ap"]\\["Custom", "CS", "Formulas", "Pseudo", "Theorem", "Figures", "Tables", "Open-SLAM"]\["APSST", "Q"]\[]\0\0\\["Robotics", "Navigation", "RealWorld", "LRF", "Simulation"]\["TD"]\1\0\0\["Germany"]\["University of Bremen"]\["Cognitive Systems"]\0\[]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "s", "k"]\["t", "r", "l", "s", "k"]\\\\\["Obstacles"]
2756467676\Transfer learning via linear multi-variable mapping under reinforcement learning framework\Some("http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=8028754")\["#", "v"]\["multi"]\["Q"]\["tt"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["SARSA"]\["svg", "Ma"]\0\0\\["2D", "Simulation", "KeepAway", "Navigation", "MultiAgent", "RoboCup"]\["TD"]\1\0\0\["China"]\["National University of Defense Technology"]\["College of Mechatronics and Automation"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2093192040\Transfer in inverse reinforcement learning for multiple strategies\Some("http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6696817")\["s_i", "s_f", "levels"]\["multi"]\["advisor"]\["tt", "j"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Lemma"]\["StochasticPolicy"]\["N/A"]\0\0\\["Robotics", "Simulation", "MiniGolf", "Arm", "2D", "Navigation", "Grid"]\["TD"]\1\0\0\["Switzerland"]\["Ecole Polytechnique Federale de Lausanne"]\["Learning Algorithms and Systems Laboratory (LASA)"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\["Grid"]
1912083604\Autonomous inter-task transfer in reinforcement learning domains\Some("http://www.dtic.mil/dtic/tr/fulltext/u2/1024624.pdf")\["v", "a", "#", "t", "s_i", "s_f", "s", "r"]\["same_all", "multi", "diff-no", "diff-it", "lit"]\["pi", "rule", "advisor", "Q"]\["j", "tt", "tr", "ap"]\\["Custom", "CS", "Formulas", "Figures", "Tables", "Pseudo", "KeepAway", "Web", "Dead"]\["Q", "SARSA", "NEAT", "FRM"]\["exp", "sup", "svg", "Ma"]\1\0\\["Simulation", "2D", "3D", "2Dto3D", "MountainCar", "MountainCar3D", "ServerJob", "Scheduling", "KeepAway", "Ringworld", "KnightJoust", "MultiAgent", "ClassicControl", "RoboCup"]\["TD", "PS", "MB", "RRL"]\1\0\0\["USA"]\["The University of Texas"]\["Computer Science"]\1\["h", "all", "lib", "mod"]\["mag", "bone"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
1481405077\Reinforcement learning transfer via common subspaces\Some("https://dblp.uni-trier.de/db/conf/atal/ala2011.html#Bou-AmmarT11")\["v", "a", "#", "t"]\["lit"]\["fea", "sub"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["FVI"]\["exp"]\1\0\\["2D", "CartPole", "Swingup", "MassSystem", "Simulation", "ClassicControl"]\["TD"]\1\1\1\["Netherlands", "USA"]\["Maastricht University", "Lafayette College"]\[]\0\["h"]\["mag", "zhu"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[["SingleMass", "DoubleMass"], ["Pendulum", "CartpoleSwingup"]]\\\
2998488389\Compositional Transfer in Hierarchical Reinforcement Learning\Some("https://arxiv.org/abs/1906.11228")\["t", "#", "s_i", "s_f", "s"]\["multi", "sim2real"]\["off", "sub"]\["tt", "tr", "j"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "TensorFlow", "Videos", "Web"]\["RHPO", "SAC"]\["N/A"]\0\1\\["3D", "Robotics", "Simulation", "RealWorld", "Lift", "Stack", "Pile"]\["H", "TD"]\1\0\0\["UK"]\["DeepMind"]\["Industry"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\\
3017464978\Sonic to knuckles: Evaluations on transfer reinforcement learning\Some("https://www.spiedigitallibrary.org/conference-proceedings-of-spie/11425/114250J/Sonic-to-knuckles-Evaluations-on-transfer-reinforcement-learning/10.1117/12.2559546.full")\[]\[]\[]\[]\\["Custom", "CS"]\["PPO"]\[]\2\1\\["Control System", "Games", "VideoGames", "Sonic3", "Simulation", "Military"]\["TD"]\0\1\0\["USA"]\["Vanderbilt University", "Ohio University", "Air Force Research Lab"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\[]\\[]\\\
2150385772\Reinforcement learning transfer based on subgoal discovery and subtask similarity\Some("https://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=7004683")\["s", "t", "levels"]\["multi"]\["sub", "options"]\["j", "tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["Q", "SARSA", "SDHRL"]\["N/A"]\0\0\\["2D", "Navigation", "Maze", "Grid", "Simulation"]\["TD", "H"]\1\0\0\["China"]\["Nanjing University"]\["Computer Science and Technology"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\["Grid", "Maze"]
2887063207\Intelligent Land-Vehicle Model Transfer Trajectory Planning Method Based on Deep Reinforcement Learning.\Some("https://www.preprints.org/manuscript/201808.0049/v1/download")\["t"]\["sim2real", "multi", "lit"]\["model"]\[]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "TensorFlow"]\["DDPG", "DRL-MTTP", "DQN"]\["N/A"]\0\1\\["3D", "Navigation", "Driving", "Trajectory", "Planning", "Autonomous", "Mobility", "Simulation", "RealWorld", "Vehicle"]\["TD"]\1\0\0\["China"]\["Central South University", "Harbin Institute of Technology", "Chongqing University", "Hunan University of Commerce"]\["Information Science and Engineering", "State Key Laboratory of Robotics and System", "State Key Laboratory of Mechanical Transmissions", "School of Computer and Information Engineering"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
1931215804\Transferring evolved reservoir features in reinforcement learning tasks\Some("https://rd.springer.com/chapter/10.1007%2F978-3-642-29946-9_22")\["v"]\["lit"]\["fea"]\["j", "tr", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Tables", "RL-Glue", "RL-Glue(deadlinks)"]\["NEAT"]\["exp"]\0\1\\["Simulation", "MountainCar", "2D", "3D", "2Dto3D", "Scheduling", "ServerJob", "ServerJobScheduling", "ClassicControl"]\["TD", "PS"]\1\0\0\["Greece"]\["Aristotle University of Thessaloniki", "Centre for Research and Technology Hellas"]\["Electrical & Computer Engineering", "Informatics and Telematics"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2604569119\Improving Deep Reinforcement Learning with Knowledge Transfer.\Some("https://dblp.uni-trier.de/db/conf/aaai/aaai2017.html#GlattC17a")\[]\["theory"]\[]\[]\\[]\["DQN"]\[]\2\2\\["Theory"]\["TD"]\1\0\0\["Brazil"]\["Universidade de São Paulo"]\["Escola Politécnica"]\0\[]\["mag", "multi"]\["t", "r", "l", "k"]\["t", "r", "l", "s"]\["t", "r", "l", "k", "s"]\\\\\
2808217720\Knowledge transfer in reinforcement learning\Some("https://www.didaktorika.gr/eadd/handle/10442/38511")\["s_i", "s_f", "t", "r"]\["same_all", "multi"]\["I", "fea"]\["j", "tr", "ap"]\\["Custom", "CS", "Pseudo", "Figures", "Tables", "Formulas"]\["Q", "MAXQ", "FQI", "LSPI", "LSTDQ"]\["N/A"]\0\0\\["MiniGolf", "MountainCar", "Boat", "2D", "Navigation", "Simulation", "ClassicControl"]\["batch", "H", "B", "TD"]\1\1\0\["Italy"]\["Politecnico di Milano"]\["Electronics and Information"]\1\["lib", "all"]\["taylor", "lazaric", "mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2100814505\The Perceived Impacts of Supervisor Reinforcement and Learning Objective Importance on Transfer of Training\Some("https://eric.ed.gov/?id=EJ597495")\[]\["unrelated"]\[]\[]\\[]\[]\[]\2\2\\["Finance", "Human", "Training", "unrelated"]\[]\1\1\1\["USA"]\["University of Minnesota", "Vital Consulting Group"]\[]\0\[]\["mag"]\[]\[]\[]\\\\\
2940691063\Human Causal Transfer: Challenges for Deep Reinforcement Learning.\Some("https://mindmodeling.org/cogsci2018/papers/0080/index.html")\[]\["psychology", "pure"]\[]\[]\\["Custom", "CS", "Gym", "Videos", "Formulas", "Figures"]\["DDQN"]\[]\0\1\\["2D", "Navigation", "Grid", "Simulation", "Human", "Causal", "Comparison"]\["TD"]\1\0\0\["USA"]\["UCLA", "Caltech", "UW"]\["Computer Science", "Psychology", "Jet Propulsion Laboratory", "Statistics"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2125946670\A framework for the adaptive transfer of robot skill knowledge using reinforcement learning agents\Some("https://dblp.uni-trier.de/db/conf/icra/icra2001.html#MalakK01")\["s_i", "s_f", "s", "levels"]\["multi"]\["advisor"]\["tt"]\\["Custom", "CS", "Formulas", "Figures"]\["Q"]\["exp"]\0\0\\["2D", "Navigation", "Grid", "Maze", "Robotics", "Simulation"]\["TD"]\1\0\0\["USA"]\["Carnegie Mellon University"]\["Electrical and Computer Engineering"]\0\["lib"]\["mag"]\["t", "r", "l", "k", "s"]\["t", "r", "l", "k", "s"]\["t", "r", "l", "k", "s"]\\\\\["Grid", "Maze"]
2967621954\UAV Trajectory Design Based on Reinforcement Learning for Wireless Power Transfer\Some("https://ieeexplore.ieee.org/document/8793294")\[]\["pure"]\[]\[]\\["Custom", "CS", "Formulas", "Figures"]\["Q"]\[]\2\0\\["Wireless Power Transfer", "SmartEnergy", "PureRL", "Drones", "UAV", "Vehicle", "Aerial", "Simulation"]\[]\1\1\1\["South Korea"]\["Yonsei University"]\["Electrical and Electronics Engineering"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\[]\\\
2398715346\Advice Taking and Transfer Learning: Naturally Inspired Extensions to Reinforcement Learning.\Some("https://experts.umn.edu/en/publications/advice-taking-and-transfer-learning-naturally-inspired-extensions")\["#"]\["multi"]\["rule", "advisor"]\["j"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "RoboCup", "BreakAway", "MultiAgent", "Simulation", "3v2", "2v1"]\["TD", "RRL"]\1\0\0\["USA"]\["University of Wisconsin", "University of Minnesota"]\["Computer Sciences"]\0\[]\["mag"]\["t", "r", "l"]\["t", "l"]\["t", "r", "l", "k", "s"]\\[["BreakAway2v1", "BreakAway3v2"]]\\\
1573527757\Transfer of task representation in reinforcement learning using policy-based proto-value functions\Some("http://doi.acm.org/10.1145/1402821.1402864")\["s_f", "t"]\["multi"]\["pvf"]\["j", "tr", "ap", "tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Simulation", "Grid", "ThreeRoomMaze"]\["TD"]\1\0\0\["Italy"]\["Politecnico di Milano"]\[]\0\[]\["mag", "lazaric"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\["ThreeRoomMaze", "Rooms"]
1255659923\Fixed vs. Dynamic Sub-Transfer in Reinforcement Learning.\Some("https://dblp.uni-trier.de/db/conf/icmla/icmla2002.html#CarrollP02")\["s_i", "s_f", "s", "levels"]\["multi"]\["pi_fix", "pi", "pi_dyn"]\["tr", "ap", "j"]\\["Custom", "CS", "Formulas", "Figures"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Simulation", "Maze", "Grid", "ThreeRoomMaze"]\["TD"]\1\0\0\["USA"]\["Brigham Young University"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "l"]\["t", "r", "l", "s"]\\\\\["Grid", "ThreeRoomMaze", "Rooms"]
2945961113\Effects of Task Similarity on Policy Transfer with Selective Exploration in Reinforcement Learning\Some("https://dl.acm.org/citation.cfm?id=3332034")\["t", "s_i", "s_f", "s", "levels"]\["multi"]\["model"]\["j", "tr"]\\["Custom", "CS", "Formulas", "Figures"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Simulation", "Taxi", "Grid"]\["TD", "MB"]\1\0\0\["Singapore"]\["National University of Singapore"]\["Computing"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2885067833\Concept-Aware Feature Extraction for Knowledge Transfer in Reinforcement Learning.\Some("https://dblp.uni-trier.de/db/conf/aaai/aaai2018w.html#Winderd18")\[]\["unavailable"]\["fea"]\[]\\[]\[]\["exp"]\0\2\\["ConceptAware", "Features", "FeatureExtraction"]\[]\0\0\0\[]\[]\[]\0\[]\["mag"]\[]\[]\[]\\\\\
2991117878\Playing Games in the Dark: An approach for cross-modality transfer in reinforcement learning.\Some("https://arxiv.org/pdf/1911.12851.pdf")\["v"]\["same_all"]\["model"]\["tr"]\\["Custom", "CS", "Formulas", "PyTorch", "Gym", "Figures", "Tables"]\["DQN", "AVAE"]\[]\1\1\\["Input Abstraction", "VideoToSound", "Atari", "Simulation", "VideoGames", "Games"]\["TD"]\1\0\0\["Portugal", "USA"]\["INESC-ID", "Instituto Superior Técnico", "Carnegie Mellon University"]\["Computer Science"]\0\["h"]\["mag"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l"]\\[["AtariImage", "AtariSound"]]\\\
2942204155\Transferring Task Goals via Hierarchical Reinforcement Learning\Some("https://openreview.net/pdf?id=S1Y6TtJvG")\["t", "a"]\["multi"]\["r", "Hierarchical"]\["j", "tr", "ap", "tt"]\\["Custom", "CS", "Formulas", "Mujoco", "Figures", "Videos"]\["HRL"]\["exp"]\1\1\\["2D", "3D", "Robotics", "Navigation", "Simulation", "Ball", "Ant"]\["TD", "H"]\1\0\0\["USA", "UK"]\["University of California", "DeepMind"]\["Industry", "Uni", "Colab"]\0\["h"]\["mag"]\["t", "r", "l"]\[]\["t", "r", "l", "k"]\\\1\\
1990654808\Context transfer in reinforcement learning using action-value functions\Some("https://core.ac.uk/display/88336304")\["v", "a"]\["diff-no"]\["fea", "Q"]\["j", "tt", "ap", "tr"]\\["Custom", "CS", "Formulas", "Theorem", "Figures"]\["Q"]\["sup"]\0\0\\["Simulation", "2D", "Navigation", "Grid", "Collect", "Robots", "CrossroadTrafficController"]\["TD"]\1\0\0\["Iran"]\["University of Tehran", "Institute for Research in Fundamental Sciences"]\["Cognitive Robotics Lab", "College of Engineering", "Electrical and Computer Engineering"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\["Grid", "Obstacles"]
112971866\Transfer in Reinforcement Learning Domains\Some("https://doi.org/10.1007/978-3-642-01882-4")\[]\["theory", "book"]\[]\[]\\[]\[]\[]\2\2\\["Theory"]\[]\1\1\1\["USA"]\["University of Southern California"]\["Computer Science"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\[]\\\\\
2910911697\Boosting Reinforcement Learning in Competitive Influence Maximization with Transfer Learning\Some("https://dblp.uni-trier.de/db/conf/webi/webi2018.html#AliWC18")\["s", "s_i", "s_f"]\["same_all"]\["pi", "Q"]\["tt"]\\["Custom", "CS", "graph-tool", "Formulas", "Figures"]\["Q"]\["N/A"]\0\0\\["Simulation", "SocialNetworks", "Social", "Graph", "Influence", "Maximization"]\["TD"]\1\1\1\["Taiwan"]\["Academia Sinica", "National Tsing Hua University"]\["Social Networks and Human-Centered Computing", "Information Science", "Computer Science"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2990442018\Green wireless power transfer system for a drone fleet managed by reinforcement learning in smart industry\Some("https://pubag.nal.usda.gov/catalog/6767089")\[]\["pure"]\[]\[]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\[]\[]\2\0\\["Wireless Power Transfer", "SmartEnergy", "PureRL", "Drones", "UAV", "Vehicle", "Aerial", "Simulation"]\[]\1\0\0\["Italy"]\["University of Catania"]\["Electronics and Information"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[]\\\
3008535267\"Good Robot!": Efficient Reinforcement Learning for Multi-Step Visual Tasks with Sim to Real Transfer.\Some("http://export.arxiv.org/abs/1909.11730")\["t", "s", "s_i", "s_f"]\["sim2real", "multi"]\["r", "ExperienceReplay"]\["tr", "ap"]\\["OSS", "Custom", "PyTorch", "Gym", "Figures", "Formulas", "Pseudo", "Tables", "Videos"]\["VPG", "DQN"]\["N/A"]\0\1\\["3D", "VisualGrasping", "Robotics", "Simulation", "RealWorld", "Arm", "2D", "Navigation", "Grid"]\["TD"]\1\0\0\["USA"]\["The Johns Hopkins University", "NVIDIA"]\["Industry", "Uni", "Colab"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2982398001\A Building Energy Consumption Prediction Method Based on Integration of a Deep Neural Network and Transfer Reinforcement Learning\Some("https://www.worldscientific.com/doi/10.1142/S0218001420520059")\["s"]\["same_all"]\["fea"]\["ap"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Tables"]\["DQN"]\["N/A"]\0\1\\["SmartEnergy", "Power", "Consumption", "Prediction", "Simulation"]\["TD"]\1\1\1\["China", "Canada"]\["Suzhou University of Science and Technology", "McMaster University"]\["Electronics and Information Engineering", "Key Laboratory of Intelligent Building Energy Efficiency", "Engineering"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\\
2965163470\Value Function Transfer for Deep Multi-Agent Reinforcement Learning Based on N-Step Returns\Some("https://www.ijcai.org/Proceedings/2019/65")\["t", "s_i", "s_f", "s", "levels"]\["multi"]\["Q"]\["j", "tr", "tt", "ap"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo"]\["Q", "DQN"]\["N/A"]\0\1\\["2D", "Navigation", "Maze", "MARL", "Atari", "Games", "Simulation", "MultiAgent", "Pacman", "PredatorPrey"]\["TD"]\1\0\0\["China"]\["Nanjing University", "Netease"]\["National Key Laboratory for Novel Software Technology", "Fuxi AI Lab"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
3006698967\A Dynamic Financial Knowledge Graph Based on Reinforcement Learning and Transfer Learning\Some("https://doi.org/10.1109/BigData47090.2019.9005691")\[]\["same_all"]\["KnowledgeGraph"]\["tr"]\\["Custom", "CS", "Figures", "Tables"]\["BERT"]\[]\0\1\\["Finance", "Simulation", "KnowledgeGraph", "CNN", "NLP"]\["TD"]\1\1\1\["China"]\["Peking University", "Beijing Normal University"]\["EECS", "Government"]\0\[]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2971274224\A Study on Efficient Transfer Learning for Reinforcement Learning Using Sparse Coding\Some("http://www.joace.org/uploadfile/2015/1023/20151023022908338.pdf")\["s_i", "s_f", "s", "levels"]\["multi"]\["sparse_coding"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Maze", "Simulation"]\["TD"]\1\0\0\["Japan"]\["Ochanomizu University"]\["School of Humanities and Sciences"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2900640207\ASD: A Framework for Generation of Task Hierarchies for Transfer in Reinforcement Learning\Some("https://dblp.uni-trier.de/db/conf/iconip/iconip2018-3.html#GoyalMNR18")\[]\["theory"]\[]\[]\\["Custom", "CS", "Pseudo", "Figures"]\["HRL"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Taxi", "Wargus", "Collect", "MultiAgent", "Simulation"]\["TD", "H"]\1\0\0\["India", "Singapore"]\["IIIT Bangalore", "National University of Singapore"]\["International Institute of Information Technology"]\0\[]\["mag"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l", "k"]\\\\\
2986543185\Fuzzy Reinforcement Learning and Curriculum Transfer Learning for Micromanagement in Multi-Robot Confrontation\Some("https://doi.org/10.3390/info10110341")\["#", "s_i", "s_f", "s"]\["same_all"]\["curriculum"]\["j", "tt"]\\["Custom", "CS", "Robocode", "Formulas", "Figures", "Pseudo", "Tables"]\["SSAQ", "Q"]\["N/A"]\0\1\\["2D", "Robotics", "RoboCode", "Simulation", "SharedParameters", "MultiAgent"]\["TD", "MB"]\1\0\0\["China"]\["Hubei University of Arts and Science", "Northwestern Polytechnical University"]\["Computer Engineering", "Computer Science"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2964998356\Skill based transfer learning with domain adaptation for continuous reinforcement learning domains\Some("https://dblp.uni-trier.de/db/journals/apin/apin50.html#ShoelehA20")\["s_i", "s_f", "s", "levels"]\["multi"]\["Q", "graph"]\["tt", "ap", "tr"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["GSL", "SARSA", "TLDA"]\["N/A"]\0\0\\["Simulation", "2D", "Pinball", "GraphBased", "MountainCar", "3D", "MountainCar3D", "MountainCar2Dto3D", "ClassicControl"]\["TD", "H"]\1\1\1\["Iran"]\["University of Tehran"]\["Electrical and Computer Engineering"]\0\["all"]\["mag"]\["t", "r", "l", "s"]\["t", "r", "l", "s", "k"]\["t", "r", "l", "s", "k"]\\\\\
2976782275\MULTIPOLAR: Multi-Source Policy Aggregation for Transfer Reinforcement Learning between Diverse Environmental Dynamics\Some("https://arxiv.org/pdf/1909.13111")\["v", "a", "r", "t", "s", "s_i", "s_f"]\["diff-no", "lit"]\["pi", "pi_gen", "distil"]\["j", "tr"]\\["Custom", "CS", "bootstrapped", "Gym", "stable-baselines", "Figures", "Formulas", "Tables"]\["MLP", "RPL", "MULTIPOLAR"]\["N/A"]\1\1\\["2D", "Robotics", "Simulation", "Gym", "CartPole", "Acrobot", "LunarLanderContinuous", "Mujoco", "Hopper", "Ant", "InvertedPendulum", "ClassicControl"]\["TD"]\1\0\0\["Germany", "Japan"]\["Technical University of Munich", "OMRON SINIC X"]\["Industry", "Uni", "Colab"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2562252866\Effective Transfer via Demonstrations in Reinforcement Learning: A Preliminary Study\Some("https://dblp.uni-trier.de/db/conf/aaaiss/aaaiss2016.html#WangT16")\[]\["multi"]\["rule"]\["j", "tr", "tt"]\\["Custom", "CS", "Pseudo", "Figures", "Formulas", "Tables"]\["Q"]\["N/A"]\0\0\\["2D", "Simulation", "Navigation", "KeepAway", "MultiAgent", "RoboCup"]\["TD"]\1\0\0\["USA"]\["Washington State University"]\["EECS"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "l", "k"]\["t", "r", "l", "k"]\\\\\
2977949361\Manufacturing Dispatching using Reinforcement and Transfer Learning.\Some("http://arxiv.org/pdf/1910.02035.pdf")\["#", "t", "s", "s_i"]\["same_all"]\["pi", "manifold"]\["j", "tr", "tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["DMD"]\["N/A"]\0\1\\["Scheduling", "Dispatching", "Simulation"]\["TD"]\1\0\0\["USA"]\["Hitachi Industrial AI Lab"]\["Industry"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\[]\\\
2524747780\Learning a transfer function for reinforcement learning problems\Some("https://lirias.kuleuven.be/bitstream/123456789/203706/1/Croonenborghs.pdf")\["s", "levels"]\["same_all"]\["Q"]\["ap", "tr", "tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["Q", "SARSA", "TILDE"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Simulation"]\["TD"]\1\0\0\["Belgium"]\["KH Kempen University College", "Katholieke Universiteit Leuven"]\["Biosciences and Technology", "Declarative Languages and Artificial Intelligence"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2070732802\The role of temporal statistics in the transfer of experience in context-dependent reinforcement learning\Some("https://ieeexplore.ieee.org/document/7086184/")\[]\[]\[]\[]\\["Custom", "CS"]\[]\[]\2\2\\["ANN", "HIS", "TemporalCredit", "Simulation"]\[]\1\1\1\["Kuwait"]\["Arab Open University"]\["Computer Studies"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\\
1569756368\Reinforcement learning transfer using a sparse coded inter-task mapping\Some("https://core.ac.uk/display/158764893")\["t", "a", "v"]\["diff-no", "lit"]\["Q", "I"]\["tt", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo"]\["LSPI", "FQI"]\["exp"]\0\0\\["2D", "Simulation", "MountainCar", "InvertedPendulum", "CartPole", "ClassicControl"]\["TD", "batch"]\1\0\0\["Netherlands", "USA"]\["Maastricht University", "Lafayette College"]\[]\126751897\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\1\
2900471431\Feature Learning and Transfer Performance Prediction for Video Reinforcement Learning Tasks via a Siamese Convolutional Neural Network\Some("https://link.springer.com/chapter/10.1007/978-3-030-04167-0_32")\["#", "s_i", "s_f", "s", "levels"]\["multi"]\["Q", "fea"]\["j", "tt"]\\["Custom", "CS", "Figures", "Pseudo", "Formulas", "Tables"]\["SARSA", "Q"]\["exp"]\0\1\\["2D", "Navigation", "Games", "VideoGames", "CNN", "Siamese", "Atari", "PacMan", "Maze", "Simulation"]\["TD"]\1\1\1\["China"]\["Nanjing University"]\["Key Laboratory for Novel Software Technology", "Collaborative Innovation of Novel Software Technology and Industrialization"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2980820015\Modelling Generalized Forces with Reinforcement Learning for Sim-to-Real Transfer.\Some("https://www.arxiv-vanity.com/papers/1910.09471/")\["t"]\["multi", "sim2real"]\["pi"]\["tr", "tt", "ap"]\\["Custom", "CS", "Formulas", "Figures", "Mujoco", "SciPy", "Tables", "Videos"]\["MPO"]\["N/A"]\0\1\\["Robotics", "Simulation", "RealWorld", "Arm", "Grab"]\["TD"]\1\0\0\["UK"]\["DeepMind"]\["Industry"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\2\\
2021066679\Transfer Method for Reinforcement Learning in Same Transition Model -- Quick Approach and Preferential Exploration\Some("https://dblp.uni-trier.de/db/conf/icmla/icmla2011-1.html#TakanoTKT11")\["s_i", "s_f", "s", "levels"]\["multi"]\["rule", "pi"]\["tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Maze", "Simulation"]\["TD"]\1\1\1\["Japan"]\["Mie University"]\["Engineering", "Regional Innovation Studies"]\2062960870\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\3\
2967582614\Skill Transfer in Deep Reinforcement Learning under Morphological Heterogeneity.\Some("https://arxiv.org/pdf/1908.05265v2")\["v", "a"]\["lit"]\["VAE", "latent"]\["j", "tr", "ap", "tt"]\\["Custom", "CS", "Theorem", "Figures", "Formulas", "Gym", "Tables"]\["PPO", "PVED"]\["N/A"]\0\1\\["2D", "Robotics", "Mujoco", "Hopper", "Bipedal", "Walker2d", "Jaco3", "Simulation"]\["TD"]\1\0\0\["UK"]\["University of Warwick"]\["Warwick Manufacturing Group"]\0\["h"]\["mag"]\["t", "r", "l", "s"]\["t", "r", "l", "s", "k"]\["t", "r", "l", "s", "k"]\\\2\\
2123178778\Transfer of knowledge for a climbing Virtual Human: A reinforcement learning approach\Some("https://ieeexplore.ieee.org/document/5152553/")\["s_i", "s_f"]\["same_all"]\["Q", "fea"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Tables", "Figures"]\["Dyna-Q"]\["N/A"]\0\0\\["Simulation", "Robotics", "Climbing", "Human", "Joints"]\["TD"]\1\0\0\["France"]\["Université Paris 6", "Laboratoire d’Intégration des Systèmes et des Technologies in Commissariat à l’Énergie Atomique"]\["Intelligent Systems and Robotics", "Industry", "Uni", "Colab"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
3000319261\A Reinforcement Learning Approach for Efficient Opportunistic Vehicle-to-Cloud Data Transfer.\Some("http://arxiv.org/pdf/2001.05321.pdf")\[]\["pure"]\[]\[]\\[]\[]\[]\2\2\\["Vehicle", "Cloud"]\[]\1\0\0\["Germany"]\["TU Dortmund"]\["Communication Networks Institute"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\2\\
2056107474\Efficient Knowledge Transfer in Shaping Reinforcement Learning\Some("http://www.sciencedirect.com/science/article/pii/S1474667016450536")\["t", "r", "s", "a"]\["multi"]\["Q"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Simulation", "Obstacle"]\["TD"]\1\0\0\["Netherlands"]\["Delft University of Technology"]\["Center for Systems and Control"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\["Grid", "Obstacles"]
2751021526\Towards Knowledge Transfer in Deep Reinforcement Learning\Some("http://jglobal.jst.go.jp/en/public/20090422/201702272924443642")\["s", "levels", "t"]\["diff-no"]\["pi"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo"]\["DQN"]\["N/A"]\0\1\\["2D", "Games", "VideoGames", "Simulation", "Atari", "Breakout", "Atlantis", "Boxing"]\["TD"]\1\0\0\["Brazil"]\["Escola Politécnica da Universidade de São Paulo"]\[]\0\["h"]\["mag", "silva"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\[["Atari", "Atari"]]\\\
3010496493\Transferring Human Manipulation Knowledge to Robots with Inverse Reinforcement Learning\Some("https://vbn.aau.dk/da/publications/transferring-human-manipulation-knowledge-to-robots-with-inverse-")\["t"]\["same_all"]\["I"]\[]\\["Custom", "CS", "Figures", "Formulas"]\["Q"]\[]\0\0\\["Robotics", "Simulation", "RealWorld", "Imitation", "UniversalRobot"]\["TD"]\1\0\0\["Denmark"]\["Aalborg University"]\["Robotics and Automation"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\\
2945719686\Improving Deep Reinforcement Learning via Transfer\Some("https://dl.acm.org/citation.cfm?id=3332128")\[]\["theory"]\[]\[]\\[]\[]\[]\2\2\\["CNN"]\[]\1\0\0\["USA"]\["Washington State University"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2040298429\Serial pattern learning by rats: Transfer of a formally defined stimulus relationship and the significance of nonreinforcement\Some("https://link.springer.com/article/10.3758%2FBF03209273")\[]\[]\[]\[]\\[]\[]\[]\2\2\\["Psychology"]\[]\1\0\0\["USA"]\["The Johns Hopkins University"]\["Psychology"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
3024199149\Efficient Deep Reinforcement Learning via Adaptive Policy Transfer\Some("http://arxiv.org/pdf/2002.08037.pdf")\["s_i", "s_f", "s", "levels", "t"]\["multi"]\["pi", "pi_dyn", "options"]\["j", "tr", "tt", "ap"]\\["Custom", "CS", "fakeOSS", "Figures", "Formulas", "Pseudo"]\["PTF-A3C", "PTF-PPO", "A3C", "PPO"]\["N/A"]\0\1\\["2D", "Navigation", "Grid", "Simulation", "Games", "Pinball", "Reacher", "Robotics"]\["TD", "H"]\1\0\0\["China"]\["Tianjin University", "Huawei", "Nanjing University", "Netease", "JD Digits"]\["Intelligence and Computing", "Noahs Ark Lab", "Machine learning"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2910862680\Transfer Value or Policy? A Value-centric Framework Towards Transferrable Continuous Reinforcement Learning\Some("https://openreview.net/pdf?id=H1gZV30qKQ")\["t"]\["same_all"]\["pi"]\["j", "tt", "ap"]\\["Custom", "CS", "Gym", "Mujoco", "baselines", "PyTorch", "Figures", "Formulas", "Pseudo", "Theorem"]\["MVC", "TRPO", "DDPG"]\["N/A"]\0\1\\["2D", "3D", "Robotics", "Model", "ValueCentric", "Mujoco", "HalfCheetah", "InvertedPendulum", "Pendulum", "Reacher"]\["TD", "MB"]\1\0\0\["USA"]\["University of California"]\[]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\1\\
2945090640\Options in Multi-task Reinforcement Learning - Transfer via Reflection\Some("https://dblp.uni-trier.de/db/conf/ai/ai2019.html#DenisF19")\["s_i", "s_f"]\["same_all"]\["options"]\["j", "tt", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["DQN", "LOVR"]\["N/A"]\0\1\\["2D", "Navigation", "Grid", "Lifelong", "Landmark", "CNN", "Simulation"]\["TD", "H"]\0\1\1\["Canada"]\["University of Ottawa"]\["Mathematics and Statistics"]\0\["h"]\["mag", "multi"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2954881914\On mechanisms for transfer using landmark value functions in multi-task lifelong reinforcement learning.\Some("https://export.arxiv.org/pdf/1907.00884")\["s_f"]\["same_all"]\["r", "options"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["DQN", "LOVR", "Zombie"]\["N/A"]\0\1\\["2D", "Navigation", "Grid", "Cliff-Walk", "Lifelong", "Landmark", "Simulation"]\["TD", "H"]\1\0\0\["Canada"]\["University of Ottawa"]\["Mathematics and Statistics"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\2\\
2443629117\A Verification of Reinforcement Learning with Knowledge Transfer and Knowledge Selection in Multitask Learning\Some("https://www.jstage.jst.go.jp/article/iscie/29/3/29_152/_article/-char/ja/")\[]\[]\[]\[]\\[]\[]\[]\2\2\\["DifferentLanguage"]\[]\1\0\0\["Japan"]\[]\[]\0\[]\["mag"]\[]\[]\[]\\\\\
2994229809\Dynamic pricing of demand response based on elasticity transfer and reinforcement learning\Some("http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=8921683")\["t"]\["same_all"]\["Q"]\["j"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["SARSA"]\["N/A"]\0\0\\["Power", "ElectricityMarket", "Electricity", "Simulation", "DynamicPricing", "Dynamic", "Demand"]\["TD"]\1\1\1\["China"]\["Tianjin University", "State Grid Tianjin Electric Power", "Beijing Fibrlink Corporation Company"]\["Key Laboratory of Smart Grid", "Industry", "Uni", "Colab"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2963248502\Unsupervised Discovery of Decision States for Transfer in Reinforcement Learning.\Some("https://arxiv.org/abs/1907.10580")\["s_i", "s_f", "s"]\["same_all", "multi"]\["options", "sub"]\["j", "tr"]\\["fakeOSS", "Custom", "PyTorch", "A2C-Kostrikov", "Gym", "Tables", "Figures", "Formulas", "Pseudo"]\["A2C", "IR-VIC"]\["N/A"]\0\1\\["2D", "Navigation", "Grid", "Maze", "Room", "MultiRoom", "Simulation", "CNN"]\["TD"]\1\0\0\["USA"]\["Georgia Institute of Technology", "Facebook AI Research"]\["Industry", "Uni", "Colab"]\0\[]\["mag"]\["t", "r", "l"]\[]\["t", "r", "l", "k"]\\\\\
2979305570\A Weight Transfer Mechanism for Kernel Reinforcement Learning Decoding in Brain-Machine Interfaces\Some("https://www.ncbi.nlm.nih.gov/pubmed/31946644")\["a"]\["same_all"]\["WTF"]\["j"]\\["Custom", "CS", "Formulas", "Figures"]\["Softmax"]\["N/A"]\0\0\\["BMI", "KernelRL", "Mice", "Simulation"]\["TD"]\1\1\1\["China"]\["Hong Kong University of Science and Technology"]\["Electronic and Computer Engineering"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2786530308\Shaping in reinforcement learning by knowledge transferred from human-demonstrations of a simple similar task\Some("https://dblp.uni-trier.de/db/journals/jifs/jifs34.html#WangFL18")\["t", "v", "a"]\["diff-it"]\["r"]\["tt", "j", "ap", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Q"]\["sup"]\0\0\\["2D", "3D", "2Dto3D", "MountainCar3D", "Simulation", "MountainCar", "Human", "Demonstration", "ClassicControl"]\["TD"]\1\1\1\["China"]\["Zhejiang University"]\["Aeronautics and Astronautics", "School of Control Science and Engineering"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2399136500\Reducing Sample Complexity in Reinforcement Learning by Transferring Transition and Reward Probabilities\Some("http://dx.doi.org/10.5220/0004915606320638")\["s", "levels"]\["same_all", "multi"]\["r", "model", "transitiondynamics"]\["j", "tr", "tt", "ap"]\\["Custom", "CS", "Formulas", "Pseudo", "Theorem", "Tables", "Figures"]\["TR-MAX", "R-MAX", "Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Maze", "Room", "MultiRoom", "Simulation"]\["TD", "MB"]\1\0\0\["Japan"]\["Tohoku University"]\["Information Sciences"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
3023198000\GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning.\Some("http://arxiv.org/pdf/2005.00406.pdf")\[]\["pure"]\[]\[]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\[]\[]\2\1\\["Circuit Design", "Transistor Sizing"]\["TD", "B"]\1\0\0\["USA"]\["Massachusetts Institute of Technology", "The University of Texas"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2889943195\Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning\Some("https://nips.cc/Conferences/2018/Schedule?showEvent=11239")\["#", "s", "levels", "s_i", "s_f"]\["multi"]\["transitiondynamics"]\["j", "tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Theorem", "Figures"]\["DOORMAX", "OO-MDP"]\["N/A"]\0\0\\["2D", "Navigation", "Simulation", "Taxi", "Sokoban"]\["TD", "RRL", "MB"]\1\0\0\["South Africa"]\["University of the Witwatersrand", "Council for Scientific and Industrial Research"]\["Industry", "Uni", "Colab"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
1974126445\Implementation of Reinforcement Learning by transfering sub-goal policies in robot navigation\Some("https://dblp.uni-trier.de/db/conf/siu/siu2013.html#GokceA13")\["s_i", "s_f", "s"]\["same_all"]\["pi"]\["tr"]\\["Custom", "CS", "Figures"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Robotics", "Rooms", "Simulation"]\["TD"]\1\0\0\["Turkey"]\["Bogazici University"]\[]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\[]\\\\\["Grid", "Rooms"]
1997664188\Transfer Learning Method Using Ontology for Heterogeneous Multi-agent Reinforcement Learning\Some("https://thesai.org/Publications/ViewPaper?Volume=5&Issue=10&Code=IJACSA&SerialNo=22")\["a", "v", "s_i", "s_f", "s"]\["diff-no", "lit"]\["Q", "federated"]\["j", "tr", "tt", "ap"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Q"]\["exp"]\0\0\\["2D", "Navigation", "Grid", "PredatorPrey", "MultiAgent", "Simulation"]\["TD", "H"]\1\0\0\["Japan"]\["Tokyo Denki University", "National Institute of Advanced Industrial Science and Technology (AIST)"]\["Advanced Science and Technology", "Information and Communication Engineering", "Intelligent Systems Research Institute"]\0\["h"]\["mag", "silva"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\["Grid", "PredatorPrey"]
2972522277\Learning Transferable Domain Priors for Safe Exploration in Reinforcement Learning\Some("https://export.arxiv.org/abs/1909.04307")\["s_i", "s_f", "s", "levels"]\["diff-no"]\["pri"]\["tr"]\\["Custom", "CS", "Formulas", "Pseudo", "Gym", "Figures", "Theorem"]\["Q", "A2C", "SARSA", "DQN"]\["N/A"]\0\1\\["2D", "Navigation", "Grid", "Continuous", "Maze", "Simulation", "SafetyGym"]\["TD"]\1\0\0\["Australia"]\["Deakin University"]\["Applied Artificial Intelligence"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\2\\
2323475154\A Hybrid Transfer Algorithm for Reinforcement Learning Based on Spectral Method\Some("http://pub.chinasciencejournal.com/article/getArticleRedirect.action?doiCode=10.3724/SP.J.1004.2012.01765")\[]\[]\[]\[]\\["Custom", "CS"]\[]\[]\2\2\\["2D", "Simulation", "Navigation", "Grid", "Spectral"]\[]\0\1\0\["China"]\["University of Mining and Technology"]\["Information and Electrical Engineering"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\[]\\\\\
2991344719\Attention Privileged Reinforcement Learning for Domain Transfer\Some("https://export.arxiv.org/pdf/1911.08363")\["#", "t", "s", "s_i", "s_f"]\["same_all"]\["pi"]\["tt", "ap", "tr"]\\["Custom", "CS", "TensorFlow", "Formulas", "Pseudo", "Figures", "Web", "Videos"]\["aDDPG", "APRiL"]\["N/A"]\0\1\\["Simulation", "Robotics", "2D", "Navigation", "Arm", "Legs", "Walker2d", "JacoReach"]\["TD"]\1\0\0\["UK"]\["University of Oxford", "DeepMind"]\["Applied AI Lab", "Industry", "Uni", "Colab"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2372578164\Transfer of Reinforcement Learning: The State of the Art\Some("http://en.cnki.com.cn/Article_en/CJFDTOTAL-DZXU2008S1006.htm")\[]\["theory"]\[]\[]\\[]\[]\[]\2\2\\["Theory"]\[]\0\1\0\["China"]\["Nanjing University", "State Key Laboratory of Novel Software Technology"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\[]\\\\\
1178202497\A model-based markovian context-dependent reinforcement learning approach for neurobiologically plausible transfer of experience\Some("https://dblp.uni-trier.de/db/journals/ijhis/ijhis12.html#Hamid15")\[]\[]\[]\[]\\["Custom", "CS"]\[]\[]\2\2\\["ANN", "HIS", "TemporalCredit", "Simulation"]\[]\1\1\1\["Kuwait"]\["Arab Open University"]\["Computer Studies"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
143164768\Using Options for Knowledge Transfer in Reinforcement Learning TITLE2\Some("https://dl.acm.org/citation.cfm?id=897568")\["t"]\["multi"]\["options"]\["tt"]\\["Custom", "CS", "Figures", "Formulas"]\["Q"]\["N/A"]\0\0\\["MountainCar", "Simulation", "ClassicControl"]\["TD"]\1\0\0\["USA"]\["University of Massachusetts"]\[]\1510402218\["all"]\["taylor", "mag", "lazaric"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\[]\\3\
2611612482\Bacteria Foraging Reinforcement Learning for Risk-Based Economic Dispatch via Knowledge Transfer\Some("https://ideas.repec.org/a/gam/jeners/v10y2017i5p638-d97735.html")\["s"]\["same_all"]\["Q"]\["j"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["BFRL"]\[]\0\0\\["Power", "Efficiency", "SmartEnergy", "Effectiveness", "Simulation", "IEEE RTS-19", "Bacteria", "MultiAgent"]\["TD"]\1\0\0\["China"]\["South China University of Technology", "Kunming University of Science and Technology"]\["Electric Power Engineering"]\0\[]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
3013348603\Parallel Knowledge Transfer in Multi-Agent Reinforcement Learning.\Some("http://arxiv.org/pdf/2003.13085.pdf")\["s_i", "s_f"]\["multi"]\["advisor"]\["tr", "tt", "ap"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["RL", "DQN", "DDPG"]\["N/A"]\0\1\\["2D", "Navigation", "Simulation", "Grid", "Collect", "Cooperation", "MultiAgent"]\["TD"]\1\0\0\["USA", "China"]\["Carnegie Mellon University", "Sun Yat-sen University"]\["Robotics", "Mathematics"]\0\[]\["mag", "multi"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\2\\
1696410204\Experiments with Adaptive Transfer Rate in Reinforcement Learning\Some("https://rd.springer.com/chapter/10.1007/978-3-642-01715-5_1")\["s_i", "s_f"]\["same_all"]\["pi"]\["j", "tr", "tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["Q", "PPR", "MGE"]\["N/A"]\0\0\\["2D", "Navigation", "Simulation", "Grid"]\["TD", "H"]\1\1\1\["France"]\["Université Paris-Dauphine", "Université Paris 6", "Centre IRD de l'Ile de France"]\[]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\["Grid"]
2896823681\Episodic memory transfer for multi-task reinforcement learning\Some("https://www.sciencedirect.com/science/article/abs/pii/S2212683X18300902")\["v"]\["multi"]\["LSTM"]\["ap"]\\["Custom", "CS", "Formulas"]\["SEM-PAAC", "PAAC", "A3C-LSTM"]\["N/A"]\0\1\\["2D", "Navigation", "Taxi", "Simulation", "Multitask", "Memory", "EpisodicMemory"]\["TD", "H"]\1\1\1\["Russia"]\[]\[]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "s"]\["t", "r", "l", "k", "s"]\\\\\
2984781810\Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation\Some("https://export.arxiv.org/pdf/1911.07450")\["s_i", "s_f", "s", "levels"]\["multi"]\["Curriculum"]\["tt", "tr", "ap"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["A3C", "LSTM"]\["N/A"]\0\1\\["3D", "Navigation", "Simulation", "Hypernetwork", "AI2-THOR"]\["TD", "H", "MB"]\1\0\0\["China", "USA"]\["Zhejiang University", "University of California"]\[]\0\[]\["mag"]\["t", "r", "l", "s"]\["t", "r", "l", "s"]\["t", "r", "l", "s", "k"]\\\2\\
3027086341\Human Instruction-Following with Deep Reinforcement Learning via Transfer-Learning from Text\Some("http://arxiv.org/pdf/2005.09382.pdf")\[]\["same_all"]\["WT"]\["ap"]\\["Custom", "CS", "BERT", "TensorFlow", "Figures", "Tables"]\["BERT"]\["N/A"]\0\1\\["NLP", "TextInstructions", "3D", "Unity", "Simulation"]\["TD"]\1\0\0\["USA"]\["DeepMind"]\["Industry"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\2\\
2962744619\An improved reinforcement learning algorithm based on knowledge transfer and applications in autonomous vehicles\Some("https://dblp.uni-trier.de/db/journals/ijon/ijon361.html#DingDWH19")\["v", "t"]\["lit"]\["cases", "Q", "pi_gen"]\["j"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["Q", "KT-HA-Q"]\["sup"]\0\0\\["2D", "ClassicControl", "Simulation", "2Dto3D", "3D", "MountainCar", "MountainCar3D"]\["TD", "CBR"]\1\1\1\["China"]\["University of Shanghai", "Northeast Petroleum University"]\["Control Science and Engineering", "Complex Systems and Advanced Control"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\["t", "r", "l", "k", "s"]\\\\\
2901461022\Self-organizing maps for storage and transfer of knowledge in reinforcement learning\Some("https://doi.org/10.1177/1059712318818568")\["s_i", "s_f"]\["same_all"]\["Q"]\["tr"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["SOM", "Q"]\["SOM"]\0\0\\["SOF", "SelfOrganizingMap", "2D", "Navigation", "Simulation"]\["TD"]\1\0\0\["Singapore"]\["Singapore University of Technology and Design"]\["Computer Science"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
3003281790\Multiperspective Light Field Reconstruction Method via Transfer Reinforcement Learning.\Some("http://downloads.hindawi.com/journals/cin/2020/8989752.pdf")\["s"]\["lit"]\["pi"]\["ap"]\\["Custom", "CS", "Formulas", "Pseudo", "TensorFlow", "Tables"]\["Q"]\["exp"]\0\1\\["Simulation", "LightFieldReconstruction", "PCA", "Q", "Image", "ImageRecognition", "CNN", "Vehicle", "MultiAgent"]\["TD"]\1\0\0\["China"]\["Henan Institute of Science and Technology", "Shandong University"]\["Artificial Intelligence", "Information Engineering", "Control Science and Engineering"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2965999293\Training a RoboCup Striker Agent via Transferred Reinforcement Learning.\Some("https://link.springer.com/chapter/10.1007%2F978-3-030-27544-0_9")\["s_i", "s_f"]\["same_all"]\["pi"]\["tt"]\\["Custom", "CS", "Tables", "Figures", "Formulas"]\["DDPG"]\["N/A"]\0\1\\["Simulation", "Striker", "RoboCup", "Curriculum", "CNN", "MultiAgent", "Soccer"]\["TD", "MB"]\1\1\1\["USA"]\["Colorado School of Mines"]\["Computer Science"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2062960870\Preferential exploration method of transfer learning for reinforcement learning in Same Transition Model\Some("https://dblp.uni-trier.de/db/conf/scisisis/scisisis2012.html#TakanoTKT12")\["s_i", "s_f", "s", "levels"]\["multi"]\["rule", "pi"]\["tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Tables", "Figures"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Maze", "Simulation"]\["TD"]\1\1\1\["Japan"]\["Mie University"]\["Engineering", "Regional Innovation Studies"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\["Grid", "Rooms"]
2834852173\Common Subspace Transfer for Reinforcement Learning Tasks\Some("https://cris.maastrichtuniversity.nl/en/publications/common-subspace-transfer-for-reinforcement-learning-tasks")\["v", "a", "#", "t"]\["lit"]\["fea", "sub"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["FVI"]\["exp"]\1\0\\["2D", "CartPole", "Swingup", "MassSystem", "Simulation", "ClassicControl"]\["TD"]\1\1\1\["Netherlands", "USA"]\["Maastricht University", "Lafayette College"]\[]\1481405077\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[["SingleMass", "DoubleMass"], ["Pendulum", "CartpoleSwingup"]]\\3\
2902788990\Cache-Enabled Adaptive Bit Rate Streaming via Deep Self-Transfer Reinforcement Learning\Some("https://dblp.uni-trier.de/db/conf/wcsp/wcsp2018.html#ZhangZLHY18")\[]\["same_all"]\["r", "imitation"]\["tr"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["DQN", "NAF"]\["N/A"]\0\1\\["Network", "Wireless", "Simulation", "Adaptive", "BitRate", "Streaming", "SelfTransfer", "CacheEnabled"]\["TD"]\1\1\1\["China"]\["Southeast University"]\["Information Science and Engineering"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2906852733\Reusing Source Task Knowledge via Transfer Approximator in Reinforcement Transfer Learning\Some("https://www.mdpi.com/2073-8994/11/1/25/pdf")\["s", "s_i", "s_f", "t"]\["diff-it", "lit"]\["fea", "Q"]\["j", "tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["TL-ANNA", "PPR", "Q"]\["exp"]\1\1\\["2D", "Navigation", "RoboCup", "Simulation", "KeepAway", "ProbabilisticPolicyReuse", "MultiAgent", "3v2", "4v3"]\["TD"]\1\0\0\["China"]\["National University of Defense Technology"]\["Intelligence Science and Technology"]\0\["lib"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2166502700\Intelligent call transfer based on reinforcement learning\Some("https://dblp.uni-trier.de/db/conf/ijcnn/ijcnn2000-6.html#JevticS00")\[]\["pure"]\[]\[]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\[]\[]\2\0\\["Call Transfer", "ICT", "Simulation"]\["TD"]\0\1\1\["Croatia"]\["University of Zagreb"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l"]\\\\\
2949169393\Continual Reinforcement Learning deployed in Real-life using Policy Distillation and Sim2Real Transfer\Some("https://openreview.net/pdf?id=BklR5pHhjV")\["t", "s_i", "s_f", "s", "levels"]\["sim2real"]\["distil", "pi_dyn", "SRL"]\["tr"]\\["OSS", "stable-baselines2", "PyBullet", "numpy", "Figures", "Formulas", "Videos"]\["PPO2"]\["N/A"]\0\1\\["2D", "Navigation", "Robotics", "Simulation", "RealWorld"]\["TD"]\1\0\0\["France"]\["Flowers Laboratory", "Softbank Robotics Europe", "Thales"]\["AI Lab", "Theresis Lab"]\0\["lib"]\["mag"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l", "k"]\\\\\
2973019294\Transferring Human Manipulation Knowledge to Industrial Robots Using Reinforcement Learning\Some("http://www.sciencedirect.com/science/article/pii/S2351978920301372")\["t"]\["same_all"]\["I"]\[]\\["Custom", "CS", "Gym", "Figures", "Formulas", "Tables", "Pseudo"]\["Q"]\[]\0\0\\["Robotics", "Simulation", "RealWorld", "Imitation", "Gazebo"]\["TD"]\1\0\0\["Spain", "Denmark"]\["Mondragon University", "Aalborg University"]\["Robotics and Automation"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\\
2892267807\VPE: Variational Policy Embedding for Transfer Reinforcement Learning\Some("http://export.arxiv.org/pdf/1809.03548")\["t"]\["sim2real"]\["Q", "pi_gen"]\["tr"]\\["Custom", "CS", "Gym", "Figures", "Formulas", "Videos"]\["DQN"]\[]\0\1\\["Robotics", "Simulation", "RealWorld", "MasterQ", "Pendulum", "ClassicControl"]\["TD", "B"]\1\0\0\["Sweden"]\["Royal Institute of Technology", "Örebro University"]\["Robotics, Perception and Learning Lab", "Applied Autonomous Sensor Systems"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2794577523\Cache-Enabled Dynamic Rate Allocation via Deep Self-Transfer Reinforcement Learning.\Some("https://arxiv.org/pdf/1803.11334")\[]\["same_all"]\["r", "imitation"]\["tr"]\\["Custom", "CS", "Formulas", "Pseudo", "Tables", "Figures"]\["DQN", "NAF"]\["N/A"]\0\1\\["Network", "Wireless", "Simulation", "Adaptive", "BitRate", "Streaming", "SelfTransfer", "CacheEnabled"]\["TD"]\1\1\1\["China"]\["Southeast University"]\["Information Science and Engineering"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\2\\
1585019787\Grounding Hierarchical Reinforcement Learning Models for Knowledge Transfer\Some("https://128.84.21.199/abs/1412.6451?context=cs.RO")\["s_f", "levels"]\["multi"]\["I"]\["ap", "tr"]\\["Custom", "CS", "Formulas", "Figures"]\["Q", "QP"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Maze", "Simulation"]\["TD", "H", "MB"]\1\0\0\["Germany"]\["Otto-Friedrich-Universität Bamberg"]\["Information Systems and Applied Computer Sciences"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["r", "l"]\["t", "r", "l", "k"]\\\\\
126751897\Automated transfer in reinforcement learning\Some("https://cris.maastrichtuniversity.nl/portal/en/publications/automated-transfer-in-reinforcement-learning(0a610cee-18d6-4ffe-a803-e6153fc7dc61).html")\["t", "a", "v"]\["diff-no", "lit"]\["Q", "I"]\["tt", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Tables"]\["LSPI", "FQI"]\["exp"]\0\0\\["2D", "Simulation", "MountainCar", "InvertedPendulum", "CartPole", "ClassicControl"]\["TD", "H", "B", "MB", "batch"]\1\0\0\["Netherlands", "USA"]\["Maastricht University", "Lafayette College"]\["Computer Science"]\1\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[["InvertedPendulum", "MountainCar"], ["MountainCar", "CartPole"], ["InvertedPendulum", "CartPole"]]\\\
2952460221\Transfer Reinforcement Learning based Framework for Energy Savings in Cellular Base Station Network\Some("https://ieeexplore.ieee.org/document/8738418")\["s_i"]\["same_all"]\["pi"]\["ap", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Actor-Critic"]\["N/A"]\0\0\\["Energy Saving", "Network", "WiFi", "RealWorld", "Simulation"]\["TD"]\1\0\0\["India"]\["Indraprastha Institute of Information Technology"]\["Electronics and Communication Engineering"]\2604038904\["h"]\["mag"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l"]\\[]\\1\
3022965940\5G UDN Proactive Handover with Transferable Reinforcement Learning-based Trajectory Prediction\Some("https://boris.unibe.ch/134014/")\[]\[]\[]\[]\\["Custom", "CS"]\[]\[]\2\2\\["5G", "ICT", "Handover"]\[]\0\1\0\[]\[]\[]\0\[]\["mag"]\["t", "r", "l"]\[]\[]\\\\\
2997386021\Bayesian Inverse Reinforcement Learning for Demonstrations of an Expert in Multiple Dynamics: Toward Estimation of Transferable Reward\Some("https://www.jstage.jst.go.jp/article/tjsai/35/1/35_G-J73/_pdf")\[]\[]\["r"]\[]\\["Custom", "CS", "Formulas", "Pseudo"]\["Q", "BIRL"]\[]\0\2\\["2D", "Navigation", "Grid", "Simulation"]\["TD"]\1\0\0\["Japan"]\["Chiba University"]\["Urban Environment Systems"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\[]\\\\\
2989943922\Helping an Agent Reach a Different Goal by Action Transfer in Reinforcement Learning\Some("http://doi.org/10.1007/978-3-030-35288-2_2")\["s_f"]\["multi"]\["advisor"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Theorem", "Pseudo", "Tables"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Simulation", "PolicySimilarity"]\["TD", "H", "RRL"]\1\1\1\["Australia"]\["University of Wollongong"]\["Computing and Information Technology"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2100493795\Online collaborative multi-agent reinforcement learning by transfer of abstract trajectories\Some("http://dare.uva.nl/document/2/64097")\["#", "levels", "s_i", "s_f"]\["multi"]\["fea", "AbstractTrajectories"]\["tt"]\\["Custom", "CS", "Multiquest", "Figures", "Pseudo", "Tables"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Maze", "Pickup", "Subgoals", "MultiAgent", "Simulation"]\["TD", "H"]\1\0\0\["Netherlands"]\["University of Amsterdam"]\["Computer Science"]\0\["h"]\["mag"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l", "k", "s"]\\\\\["Grid", "Maze", "Obstacles"]
2956458483\Global Maximum Power Point Tracking of PV Systems under Partial Shading Condition: A Transfer Reinforcement Learning Approach\Some("https://www.mdpi.com/2076-3417/9/13/2769/pdf")\["t", "s_i"]\["same_all"]\["KM", "Q"]\["j", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Q"]\["N/A"]\0\0\\["PV", "Power", "Electric", "Efficiency", "SmartEnergy", "MPPT", "Simulation"]\["TD"]\1\0\0\["China"]\["Suzhou Power Supply Company", "Kunming University of Science and Technology", "Shantou University"]\["Industry", "Uni", "Colab"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2997891625\Transfer Reinforcement Learning using Output-Gated Working Memory\Some("https://aaai.org/Conferences/AAAI-20/wp-content/uploads/2020/01/AAAI-20-Accepted-Paper-List.pdf#4442")\["s_i", "s_f"]\["multi"]\["output-gated-memory"]\["j", "tt"]\\["OSS", "Github", "pandas", "Formulas", "Figures", "Tables"]\["DQN", "HRR"]\["N/A"]\0\1\\["2D", "Navigation", "Grid", "HRR", "Gates", "Simulation"]\["TD"]\1\0\0\["USA"]\["Middle Tennessee State University"]\["Computer Science"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2987047801\A Deep Reinforcement Learning based Approach to Learning Transferable Proof Guidance Strategies.\Some("https://arxiv.org/abs/1911.02065")\["s", "t"]\["lit"]\["pi"]\["j", "ap"]\\["Custom", "CS", "deepmath", "SciKit-Optimize", "TPTP", "Formulas", "Figures", "Tables"]\["TRAIL"]\["N/A"]\1\1\\["Simulation", "FOL", "FirstOrderLogic", "TheoremProving", "Theorem", "TRAIL", "Proof", "Guidance"]\["TD"]\1\0\0\["USA"]\["Northwestern University", "University of Illinois", "MIT-IBM AI Lab", "IBM Research", "The University of Auckland"]\["Uni", "Industry", "Colab"]\0\["h"]\["mag"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l", "k"]\\\2\\
232085181\Transferring Models in Hybrid Reinforcement Learning Agents\Some("https://link.springer.com/content/pdf/10.1007%2F978-3-642-23957-1_19.pdf")\["v", "t"]\["diff-it"]\["model", "r"]\["tt", "tr"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables", "RL-Glue"]\["TiMRLA", "Q", "SARSA"]\["sup"]\0\0\\["2D", "3D", "2Dto3D", "MountainCar", "MountainCar3D", "Simulation", "ClassicControl"]\["TD", "MB"]\1\0\0\["Greece"]\["Aristotle University of Thessaloniki"]\["Informatics"]\3002235425\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\6\
197857547\Qualitative Transfer for Reinforcement Learning with Continuous State and Action Spaces\Some("https://rd.springer.com/chapter/10.1007/978-3-642-41822-8_25")\["t"]\["same_all"]\["model", "pi"]\["tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures"]\["PILCO", "QTL-PILCO"]\["N/A"]\0\0\\["InvertedPendulum", "Simulation", "GaussianProcess", "ClassicControl"]\["TD", "PS", "MB"]\1\0\0\["Mexico"]\["Instituto Nacional de Astrofísica"]\["Optics and Electronics"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2949344914\Injecting Prior Knowledge for Transfer Learning into Reinforcement Learning Algorithms using Logic Tensor Networks.\Some("http://arxiv.org/pdf/1906.06576.pdf")\["s_i", "s_f", "#"]\["multi"]\["pri"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures"]\["DDQN", "DQN", "LTS"]\["N/A"]\0\1\\["2D", "Simulation", "Navigation", "Grid"]\["TD"]\1\0\0\["Belgium", "USA"]\["Université Libre de Bruxelles", "Sony Computer Science Laboratories"]\["Industry", "Uni", "Colab"]\0\["all"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\2\\
2990376820\Self-Attentional Credit Assignment for Transfer in Reinforcement Learning\Some("https://openreview.net/pdf?id=B1xybgSKwB")\["v", "s_i", "s_f", "t", "r"]\["multi"]\["r", "credit"]\["tt"]\\["Custom", "CS", "Formulas", "Figures"]\["Q", "PPO", "DQN"]\["N/A"]\0\1\\["2D", "Navigation", "Grid", "DMLab", "Credit", "Simulation"]\["TD"]\1\0\0\["USA"]\["Google Research"]\["Industry"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\1\\
3008492644\Real–Sim–Real Transfer for Real-World Robot Control Policy Learning with Deep Reinforcement Learning\Some("https://www.mdpi.com/2076-3417/10/5/1555/pdf")\["t", "s_i", "s_f"]\["multi"]\["RSR", "model"]\["tt"]\\["Custom", "CS", "Gazebo", "TensorFlow", "Gym", "Pseudo", "Formulas", "Figures", "Tables"]\["PPO"]\["N/A"]\0\1\\["Robotics", "Simulation", "RealWorld", "Gazebo", "Manipulation", "Navigation", "3D"]\["TD"]\1\0\0\["China"]\["Chinese Academy of Sciences", "State Key Laboratory of Management and Control for Complex Systems"]\["Institute of Automation"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "s"]\["t", "r", "l", "s"]\\\\\
1803475911\Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents\Some("http://ijfs.usb.ac.ir/article_2113.html")\["s_i", "s_f", "levels"]\["multi"]\["Q", "fea"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Taxi", "Simulation"]\["TD", "H"]\1\0\0\["Iran"]\["University of Tehran"]\["Electrical and Computer Engineering"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\["Grid", "Taxi", "Obstacles"]
2980248684\Knowledge Transfer in Reinforcement Learning Agent\Some("http://xplorestaging.ieee.org/ielx7/8850849/8860868/08860881.pdf?arnumber=8860881")\["s_i", "s_f"]\["multi"]\["Q", "pi"]\["j"]\\["Custom", "CS", "Formulas", "Figures"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Simulation", "Grid"]\["TD", "CBR"]\1\0\0\["Bulgaria"]\["Bulgarian Academy of Sciences"]\["Institute of Robotics"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
3026075272\Transferable Cost-Aware Security Policy Implementation for Malware Detection Using Deep Reinforcement Learning.\None\["t"]\["same_all"]\["WT"]\["j", "tr", "tt"]\\["Custom", "CS", "Gym", "TensorFlow", "Keras", "ChainerRL", "Figures", "Formulas", "Tables"]\["ACER"]\[]\0\1\\["Malware", "Detection", "Simulation"]\["TD"]\1\0\0\["Israel"]\["Ben-Gurion University of the Negev"]\["Software and Information Systems Engineering"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\2\\
2891191051\Transferring Deep Reinforcement Learning with Adversarial Objective and Augmentation\Some("https://ui.adsabs.harvard.edu/abs/2018arXiv180900770H/abstract")\["#", "v", "t"]\["lit"]\["GAN"]\["tr", "tt"]\\["Custom", "CS", "Gym", "ALE", "Formulas", "Figures", "Tables"]\["DQN"]\["N/A"]\0\1\\["2D", "Games", "VideoGames", "Atari", "Pong", "Simulation"]\["TD"]\1\0\0\["Taiwan"]\["National Taiwan University"]\["Computer Science and Information Engineering"]\0\["h"]\["mag"]\["t", "r", "l"]\["r", "l", "k", "s"]\["t", "r", "l", "k"]\\\2\\
2167778886\Transferring experience in reinforcement learning through task decomposition\Some("https://dl.acm.org/citation.cfm?id=1558109.1558208")\["#"]\["same_all"]\["rule", "advisor"]\["tt"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Tables"]\["Q"]\["sup"]\0\0\\["2D", "RoboCup", "KeepAway", "Simulation", "MultiAgent", "4v3", "3v2", "5v4", "6v5"]\["TD"]\1\0\0\["Greece"]\["Aristotle University of Thessaloniki"]\["Informatics"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
3008744627\Efficient Deep Reinforcement Learning through Policy Transfer\Some("https://www.arxiv-vanity.com/papers/2002.08037/")\["s_i", "s_f", "s", "levels", "t"]\["multi"]\["options"]\["j", "tr", "tt", "ap"]\\["Custom", "CS", "fakeOSS", "Figures", "Formulas", "Pseudo"]\["PTF-A3C", "PTF-PPO", "A3C", "PPO"]\["N/A"]\0\1\\["2D", "Navigation", "Grid", "Simulation", "Games", "Pinball", "Reacher", "Robotics"]\["TD"]\1\0\0\["China"]\["Tianjin University", "Huawei", "Nanjing University", "Netease", "JD Digits"]\["Intelligence and Computing", "Noahs Ark Lab", "Machine learning"]\3024199149\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\7\
3022749764\Enhancing Transferability of Deep Reinforcement Learning-Based Variable Speed Limit Control Using Transfer Learning\Some("http://xplorestaging.ieee.org/ielx7/6979/4358928/09090297.pdf?arnumber=9090297")\["t"]\["same_all"]\["nn_weight_links"]\["tt"]\\["Custom", "CS", "Formulas"]\["VSL", "DDQN"]\["N/A"]\0\1\\["Simulation", "Traffic", "SmartTraffic", "SpeedLimit", "Adaptive", "Vehicle"]\["TD"]\1\1\1\["China", "Australia"]\["Southeast University", "University of Tasmania"]\["School of Transportation", "Information and Communication Technology"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2893105945\Bayesian Transfer Reinforcement Learning with Prior Knowledge Rules.\Some("https://export.arxiv.org/pdf/1810.00468")\["s"]\["same_all"]\["pri"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Gym", "Figures"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Maze", "Simulation"]\["TD", "B"]\1\0\0\["Greece"]\["Athens University"]\["Economics and Business"]\0\[]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\2\\
2989941044\Transfer Reinforcement Learning across Environment Dynamics with Multiple Advisors.\Some("http://ceur-ws.org/Vol-2491/paper11.pdf")\["t"]\["multi"]\["advisor"]\["j", "tr"]\\["Custom", "CS", "Formulas", "Figures"]\["BDPI", "ABDPI", "DQN"]\["N/A"]\0\1\\["2D", "Navigation", "Grid", "Rooms", "Simulation"]\["TD"]\1\0\0\["Belgium"]\["Vrije Universiteit Brussel"]\["Computer Science"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
1604959332\Relational transfer across reinforcement learning tasks via abstract policies.\Some("http://www.teses.usp.br/teses/disponiveis/3/3141/tde-04112014-103827/pt-br.php")\["s_i", "s_f", "levels"]\["multi"]\["Q", "pi_dyn"]\["tt", "ap"]\\["OSS", "Custom", "Figures", "Pseudo", "Formulas", "Tables"]\["SARSA", "Q", "AbsSarsa", "AbsProb-RL", "S2L-RL"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Simulation", "MonteCarlo"]\["RRL", "CBR", "H", "TD", "MB"]\1\0\0\["Brazil"]\["University of Sao Paulo"]\["Computer Engineering"]\2\["all", "lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\["Grid"]
3007833995\Multi-AUV Collaborative Target Recognition Based on Transfer-Reinforcement Learning\Some("https://dblp.uni-trier.de/db/journals/access/access8.html#CaiSXMC20")\["s"]\["same_all"]\["fea", "Q"]\["ap", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Tables"]\["Q"]\["N/A"]\0\1\\["Underwater", "Image", "Feature", "Extraction", "Target", "Recognition", "AutonomousUnderwaterVehicle", "AUV", "CNN", "Vehicle", "Simulation"]\["TD", "B"]\1\0\0\["China"]\["Henan Institute of Science and Technology", "Shandong University"]\["Artificial Intelligence", "Information Engineering", "Control Science"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2573943923\Transfer of Reinforcement Learning Negotiation Policies: from Bilateral to Multilateral Scenarios\Some("https://researchportal.hw.ac.uk/en/publications/transfer-of-reinforcement-learning-negotiation-policies-from-bila")\["t", "a"]\["multi"]\["pi"]\["tr"]\\["Custom", "CS"]\[]\[]\0\2\\["Simulation", "Trading", "Settlers"]\["TD", "B"]\1\0\0\["UK"]\["Heriot-Watt University"]\["Interaction Lab"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2804848719\Reinforcement-Learning-Based Personalization of Head-Related Transfer Functions\Some("http://www.aes.org/e-lib/browse.cfm?elib=19563")\[]\["pure"]\[]\[]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\[]\[]\2\2\\["HRTF", "HeadRelatedTransfer", "Personalization", "Simulation"]\["TD"]\1\0\0\["Japan"]\["Nagaoka University of Technology", "National Institute of Technology"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2617832762\State Space Decomposition and Subgoal Creation for Transfer in Deep Reinforcement Learning\Some("https://export.arxiv.org/pdf/1705.08997")\["s_f"]\["same_all"]\["sub"]\["tr"]\\["Custom", "CS", "Figures", "Formulas"]\["PG", "LSTM"]\["exp"]\0\1\\["Hierarchical", "MetaController", "Attention", "Decomposition", "SubgoalCreation", "2D", "Simulation", "Grid", "Navigation"]\["TD", "H"]\1\0\0\["USA"]\["Georgia Institute of Technology"]\["College of Computing"]\0\["h"]\["mag"]\["t", "r", "l"]\["r", "l"]\["t", "r", "l", "k"]\\\\\
2395918031\FCM-Type Co-clustering Transfer Reinforcement Learning for Non-Markov Processes\Some("https://dblp.uni-trier.de/db/conf/iukm/iukm2015.html#NotsuUHUH15")\["t"]\["same_all"]\["FCCM"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo"]\["Q", "Q-FCCM"]\["N/A"]\0\0\\["2D", "ClassicControl", "Pendulum", "Simulation"]\["TD"]\1\1\1\["Japan"]\["Osaka Prefecture University"]\["Engineering"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2617354312\A survey of transfer learning methods for reinforcement learning\Some("https://cedar.wwu.edu/computerscience_stupubs/5/")\[]\["theory", "survey"]\[]\[]\\["Pseudo"]\[]\[]\2\0\\["theory", "survey"]\[]\1\0\0\["USA"]\["Western Washington University"]\["Computer Science"]\2\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
3004450474\Adapt-to-Learn: Policy Transfer in Reinforcement Learning\Some("https://openreview.net/pdf?id=ryeT10VKDH")\["t"]\["same_all"]\["r", "manifold"]\["j", "tr"]\\["Custom", "CS", "Formulas", "Pseudo", "Theorem", "Tables", "Lemma"]\["PPO", "ATL"]\["exp"]\0\1\\["Simulation", "Robotics", "AdaptToLearn", "Vehicle", "Aerial", "Drone"]\["TD", "B", "MB"]\1\0\0\["USA"]\["University of Illinois"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "s"]\["t", "r", "l", "s"]\\\1\\
199090187\Transfer between Different Reinforcement Learning Methods\Some("https://rd.springer.com/chapter/10.1007/978-3-642-01882-4_7")\[]\["theory", "bookchapter"]\[]\[]\\[]\[]\[]\2\2\\["Theory", "Bookchapter"]\[]\0\1\0\[]\[]\[]\0\[]\["mag"]\[]\[]\[]\\\\\
166919721\Bayesian methods for knowledge transfer and policy search in reinforcement learning\Some("https://ir.library.oregonstate.edu/downloads/1544bs366")\["s_i", "s_f", "levels", "#"]\["multi"]\["advisor", "pri"]\["tt", "tr", "ap"]\\["Custom", "CS", "Formulas", "Figures", "Tables", "Pseudo"]\["MBOA", "DynaQ", "BOA", "LSPI", "Q", "CMAC", "BPS"]\["N/A"]\0\0\\["Simulation", "2D", "Navigation", "Cost", "Wargus", "MountainCar", "Acrobot", "CartPole", "Bicycle", "3LinkPlanarArm", "MultiAgent", "MonteCarlo", "ClassicControl", "PredatorPrey"]\["H", "TD", "PS", "B", "MB", "batch"]\1\0\0\["USA"]\["Oregon State University"]\["Computer Science"]\1\["all"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\["Wargus"]
3006434787\Transferring knowledge as heuristics in reinforcement learning\Some("https://dl.acm.org/doi/10.1016/j.artint.2015.05.008")\["a", "v"]\["diff-it"]\["Q"]\["j"]\\["Custom", "OSS", "Formulas", "Figures", "Tables", "Pseudo"]\["HA-SARSA", "L3-SARSA", "HAQL", "CB-HAQL"]\["sup"]\0\0\\["2D", "3D", "2Dto3D", "MountainCar", "MountainCar3D", "Simulation", "CaseBased", "ClassicControl"]\["CBR"]\1\0\0\["Brazil", "Spain"]\["Centro Universitário da FE", "Universidade Federal do ABC", "Universitat Autonoma de Barcelona"]\["Artificial Intelligence Research Institute"]\0\[]\["mag", "silva"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\[["MountainCar", "MountainCar3D"], ["Acrobot", "RobocubJoints"]]\\\
2796814993\Reinforcement Learning Transfer Using a Sparse-Coded Inter-Task Mapping (extended abstract)\Some("https://cris.maastrichtuniversity.nl/portal/en/publications/reinforcement-learning-transfer-using-a-sparsecoded-intertask-mapping-extended-abstract(e5e0ef42-617c-4e6e-b675-1827079f7fc4).html")\["t", "a", "v"]\["diff-no", "lit"]\["Q", "I"]\["tt", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo"]\["LSPI", "FQI"]\["exp"]\0\0\\["2D", "Simulation", "MountainCar", "InvertedPendulum", "CartPole", "ClassicControl"]\["TD", "batch"]\1\0\0\["Netherlands", "USA"]\["Maastricht University", "Lafayette College"]\[]\126751897\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\3\
2910166048\Knowledge Transfer between Multi-granularity Models for Reinforcement Learning\Some("https://dblp.uni-trier.de/db/conf/smc/smc2018.html#XinTWC18")\["v", "s_i", "s_f"]\["same_all"]\["Q", "options"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures"]\["Q", "MGRL"]\["exp"]\0\0\\["2D", "Navigation", "Grid", "Maze", "Simulation"]\["TD", "H"]\1\0\0\["China"]\["Nanjing University"]\["Control and Systems Engineering", "Management and Engineering"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\["t", "r", "l", "k", "s"]\\\\\
1604516390\Graph Laplacian based transfer learning in reinforcement learning\Some("https://dl.acm.org/citation.cfm?id=1402821.1402869")\["v", "s", "levels"]\["multi"]\["Q", "pvf"]\["j"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["Q", "LaplacianGraph"]\["exp"]\0\0\\["2D", "Navigation", "Grid", "Maze", "Simulation"]\["TD"]\1\0\0\["Taiwan"]\["National Tsing-Hua University"]\["Computer Science"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\["Grid", "Obstacles", "TwoRoom"]
3016369563\Combined Model for Partially-Observable and Non-Observable Task Switching: Solving Hierarchical Reinforcement Learning Problems Statically and Dynamically with Transfer Learning.\Some("http://arxiv.org/pdf/2004.06213.pdf")\["s_i", "s_f", "t"]\["same_all"]\["fea", "model"]\[]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Q"]\["exp"]\0\1\\["2D", "Navigation", "Grid", "Maze", "Simulation", "Hierarchical", "HolographicReducedRepresentations", "WorkingMemory", "AbstractTaskRepresentations"]\["TD", "H", "B"]\1\0\0\["USA"]\["Middle Tennessee State University"]\["Computer Science"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\2\\
2997729838\An Optimal Transfer of Knowledge in Reinforcement Learning through Greedy Approach\Some("https://dblp.uni-trier.de/db/conf/icccnt/icccnt2019.html#KumariCM19")\["a", "v"]\["diff-it"]\["Q"]\["tt"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["SARSA", "MASTER"]\["sup"]\0\0\\["2D", "3D", "2Dto3D", "MountainCar", "MountainCar3D", "Simulation", "ClassicControl"]\["TD"]\1\1\1\["India"]\["REC Ambedkar Nagar"]\["Information Technology"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\[["MountainCar", "MountainCar3D"]]\\\
2247618789\Reinforcement Learning Transfer Based on Subgoal Discovery and Subtask Similarity\None\["s", "t", "levels"]\["multi"]\["sub", "options"]\["j", "tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["Q", "SARSA", "SDHRL"]\["N/A"]\0\0\\["2D", "Navigation", "Maze", "Grid", "Simulation"]\["TD", "H"]\1\0\0\["China"]\["Nanjing University"]\["Computer Science and Technology"]\2150385772\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\3\
101901138\Improving Batch Reinforcement Learning Performance through Transfer of Samples\Some("https://dblp.uni-trier.de/db/conf/stairs/stairs2008.html#LazaricRB08")\["s_i", "s_f", "s"]\["multi"]\["I"]\["j", "tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["FQI"]\["N/A"]\0\0\\["2D", "Navigation", "Obstacle", "Samples"]\["TD", "batch"]\1\1\0\["Italy"]\["Politecnico di Milano"]\["IIT-Lab"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2907903452\Simulation and Transfer of Reinforcement Learning Algorithms for Autonomous Obstacle Avoidance\Some("https://dblp.uni-trier.de/db/conf/ias/ias2018.html#LenkHMRS18")\["t"]\["sim2real"]\["pi"]\["j"]\\["Custom", "CS", "TensorFlow", "Keras", "Gym", "Figures"]\["Q", "DQN", "DDPG", "A3C"]\["N/A"]\0\1\\["Robotics", "Mindstorm", "Simulation", "RealWorld", "EV3", "Navigation", "3D", "Driving", "Autonomous", "Vehicles", "Car"]\["TD"]\1\1\1\["Germany"]\["SAP SE", "Duale Hochschule Baden-Württemberg"]\["Computer Science"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2889076726\Transfer of reinforcement learning for a robotic skill\Some("https://aaltodoc.aalto.fi:443/handle/123456789/33782")\["t"]\["sim2real"]\["pi", "curriculum"]\["j", "tt"]\\["Custom", "CS", "LWRSIM"]\["PILCO"]\[]\0\0\\["Robotics", "Mujoco", "RealWorld", "Simulation", "Kuka", "Arm", "LWRSim", "PILCO"]\["TD", "PS", "MB"]\1\0\0\["Finland"]\["Aalto University"]\["School of Electrical Engineering"]\2\["h"]\["mag"]\["t", "r", "l", "s"]\["t", "r", "l", "k", "s"]\["t", "r", "l", "k", "s"]\\\\\
3016335401\A Confrontation Decision-Making Method with Deep Reinforcement Learning and Knowledge Transfer for Multi-Agent System\Some("https://doi.org/10.3390/sym12040631")\["#"]\["multi", "multiagent"]\["pi"]\["tt"]\\["Custom", "CS", "Formulas", "Figures", "Pseudo", "Tables"]\["A2C", "DDPG", "MDDPG"]\["N/A"]\0\1\\["Simulation", "InvertedPendulum", "RTS", "Starcraft", "Multiagent", "ClassicControl"]\["TD"]\1\0\0\["China"]\["Hubei University of Arts and Science"]\["Computer Engineering"]\0\[]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
3023661178\Sim-to-Real Transfer with Incremental Environment Complexity for Reinforcement Learning of Depth-Based Robot Navigation\Some("http://arxiv.org/pdf/2004.14684.pdf")\["s", "levels"]\["sim2real", "multi"]\["pi"]\[]\\["Custom", "CS", "Gazebo", "PyTorch", "Gym", "Formulas", "Figures", "Tables"]\["SAC"]\["N/A"]\0\1\\["Robotics", "Gazebo", "DepthBased", "3D", "Navigation", "Simulation", "RealWorld", "Vehicle"]\["TD"]\1\0\0\["France", "Australia"]\["Université Paris Saclay", "Flinders University"]\["Aerospace Lab", "Computer Science, Engineering and Mathematics"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2998972968\Activation and Spreading Sequence for Spreading Activation Policy Selection Method in Transfer Reinforcement Learning\Some("https://thesai.org/Downloads/Volume10No12/Paper_2-Activation_and_Spreading_Sequence_for_Spreading_Activation_Policy.pdf")\["s_i", "s_f"]\["multi"]\["pi", "rule"]\["j", "ap"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["PRQ"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Simulation"]\["TD", "PS"]\1\0\0\["Japan"]\["Tokyo Polytechnic University", "Tokyo Denki University", "The University of Tokyo"]\["Engineering", "Electronics and Mechatronics", "Information and Communication Engineering", "Precision Engineering"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
112583961\Graph Laplacian Based Transfer Learning in Reinforcement Learning (Short Paper)\Some("http://www.ifaamas.org/Proceedings/aamas08/proceedings/pdf/paper/AAMAS08_0286.pdf")\["v", "s", "levels"]\["multi"]\["Q", "pi_gen"]\["j"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["Q", "LaplacianGraph"]\["exp"]\0\0\\["2D", "Navigation", "Grid", "Maze", "Simulation"]\["TD"]\1\0\0\["Taiwan"]\["National Tsing-Hua University"]\["Computer Science"]\1604516390\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\3\
3007745042\Sim2Real Transfer for Reinforcement Learning without Dynamics Randomization.\Some("https://arxiv.org/abs/2002.11635")\["t"]\["sim2real"]\["pi"]\["j"]\\["Custom", "CS", "SurrealAI", "Pseudo", "Formulas", "Figures", "Tables", "Videos"]\["SAC", "PPO", "DDPG"]\[]\0\1\\["Robotics", "Simulation", "RealWorld", "Industry", "OSC"]\["TD", "PS"]\1\0\0\["Germany"]\["Corporate Research KUKA Deutschland GmbH"]\["Industry"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\2\\
201259648\Abstraction and Knowledge Transfer in Reinforcement Learning\Some("https://rd.springer.com/chapter/10.1007/978-3-642-16590-0_3")\[]\["theory", "bookchapter"]\[]\[]\\[]\[]\[]\2\2\\["Theory", "BookChapter"]\[]\1\1\1\[]\[]\[]\0\[]\["mag"]\["t", "r", "l", "k"]\[]\["t", "r", "l", "k"]\\\\\
2897816339\A Distributed Reinforcement Learning Solution With Knowledge Transfer Capability for A Bike Rebalancing Problem.\Some("https://arxiv.org/pdf/1810.04058")\["t", "s"]\["same_all"]\["Q", "distil"]\["j", "tr", "tt", "ap"]\\["Custom", "OSS", "pandas", "numpy", "Formulas", "Figures"]\["Q", "DiRL"]\["N/A"]\0\0\\["Simulation", "Bike", "Distribution", "Distributed", "Rebalancing", "Mobility"]\["TD"]\1\0\0\["USA"]\["New York University"]\[]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\2\\
3019373547\Transferable Training for Automated Reinforcement-Learning-Based Application-Managers\Some("https://lens.org/007-981-527-649-224")\[]\[]\[]\[]\\["Custom", "CS"]\[]\[]\2\2\\["ApplicationManager"]\[]\1\0\0\["USA"]\["VMWare Inc"]\["Industry", "Patent"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\[]\\\\\
3018700377\Transfer Learning Applied to Reinforcement Learning-Based HVAC Control\Some("http://link.springer.com/content/pdf/10.1007/s42979-020-00146-7.pdf")\["t"]\["same_all"]\["Q"]\["j", "tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Tables"]\["Q"]\["N/A"]\0\0\\["Simulation", "HVAC", "Power", "SmartEnergy", "Heat", "Ventilation"]\["TD", "B"]\1\0\0\["Ireland"]\["National University of Ireland"]\["Science and Engineering"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
3005639539\Latent Structure Matching for Knowledge Transfer in Reinforcement Learning\Some("https://doi.org/10.3390/fi12020036")\["s", "levels"]\["multi"]\["Q", "manifold"]\["j", "tt", "ap"]\\["Custom", "CS", "Formulas", "Pseudo", "Gym", "Tables", "Figures"]\["Q", "LSM"]\["N/A"]\0\0\\["Simulation", "MountainCar", "ClassicControl"]\["TD"]\1\0\0\["China"]\["University of Shanghai"]\["Computer Engineering and Science"]\0\["h"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
3002235425\Transferring task models in Reinforcement Learning agents\Some("https://dl.acm.org/doi/10.1016/j.neucom.2012.08.039")\["v", "t"]\["diff-it"]\["model", "r"]\["tt", "tr"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures", "Tables"]\["TiMRLA", "Q", "SARSA"]\["sup"]\0\0\\["2D", "3D", "2Dto3D", "MountainCar", "MountainCar3D", "Simulation", "ServerScheduling", "ClassicControl"]\["TD", "MB"]\1\0\0\["Greece", "France"]\["Aristotle University of Thessaloniki", "Université Joseph Fourier"]\["Informatics", "Laboratoire LIG"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2385762538\Knowledge Transfer Method Based on the Qualitative Model in Reinforcement Learning\Some("http://en.cnki.com.cn/Article_en/CJFDTOTAL-JSJK201106026.htm")\[]\[]\[]\[]\\[]\[]\[]\2\2\\[]\[]\0\1\0\["China"]\[]\[]\0\[]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\[]\\\\\
2998368022\Decision Method of Policy Reuse Ratio for Transfer Reinforcement Learning\Some("https://www.jstage.jst.go.jp/article/jsmermd/2019/0/2019_1A1-P02/_pdf")\[]\[]\[]\[]\\[]\[]\[]\2\2\\[]\[]\0\1\0\["Japan"]\[]\[]\0\[]\["mag"]\[]\[]\[]\\\\\
2997158575\Embodiment Mapping Method between Heterogeneous Robots based on Self-body Representation for Transfer Learning in Reinforcement Learning\Some("https://www.jstage.jst.go.jp/article/jsmermd/2019/0/2019_1A1-P01/_pdf")\[]\[]\[]\[]\\["Custom", "CS"]\[]\[]\2\2\\["Robotics"]\[]\0\1\1\["Japan"]\[]\[]\0\[]\["mag"]\["t", "r", "l"]\[]\[]\\\\\
2326834951\Reinforcement transfer learning with feature information for robot motion planning\Some("http://library.witpress.com/viewpaper.asp?pcode=ACAR14-033-1")\[]\[]\[]\[]\\[]\[]\[]\2\2\\[]\[]\0\1\0\[]\[]\[]\0\[]\["mag"]\[]\[]\[]\\\\\
2131464614\Learning Transfer Automatic through Data Mining in Reinforcement Learning\Some("https://www.ijcaonline.org/archives/volume88/number13/15411-3885")\[]\["same_all"]\["exp"]\["tt"]\\["Custom", "CS", "Pseudo", "Figures"]\["LAT", "Q"]\["N/A"]\0\0\\["2D", "Navigation", "Simulation", "Grid"]\["TD"]\1\0\0\["Iran"]\["University of Bojnord", "Amirkabir University of Technology"]\["Computer Science"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\["Grid", "Rooms", "NonStandardThreeRoom"]
3010746488\Reinforcement learning-based collision avoidance: impact of reward function and knowledge transfer\Some("https://www.cambridge.org/core/services/aop-cambridge-core/content/view/S0890060420000141")\["s_i", "s_f", "levels"]\["multi"]\["r", "pi"]\["tt", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["DQN"]\["N/A"]\0\1\\["2D", "Navigation", "Simulation", "CollisionAvoidance", "Vehicle"]\["TD"]\1\0\0\["USA"]\["University of Southern California"]\["Aerospace and Mechanical Engineering"]\0\["h", "lib"]\["mag"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2986299952\Improving the learning speed in reinforcement learning issues based on the transfer learning of neuro-fuzzy knowledge\Some("https://tjee.tabrizu.ac.ir/article_9441_f0444730a3a1ac375a6c1b54e6330981.pdf")\[]\[]\["Q", "fea"]\[]\\["Custom", "CS"]\["Q"]\[]\0\0\\["Foreign Language"]\["TD"]\1\0\0\["Iran"]\["Science and Arts University"]\["Computer Engineering"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
168011852\From Transfer to Scaling: Lessons Learned in Understanding Novel Reinforcement Learning Algorithms\Some("https://www.aaai.org/Papers/Workshops/2008/WS-08-14/WS08-14-005.pdf")\["#", "v"]\["same_all"]\["Q", "scaling"]\["tt", "j"]\\["Custom", "CS", "Formulas", "Figures"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Simulation", "Grid", "MultiAgent"]\["TD"]\1\0\0\["USA"]\["University of Maryland Baltimore County"]\["Computer Science and Electrical Engineering"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\["Grid"]
3025362191\Accelerating bio-inspired optimizer with transfer reinforcement learning for reactive power optimization\Some("https://dl.acm.org/doi/10.1016/j.knosys.2016.10.024")\["t"]\["same_all"]\["Q"]\["j", "tt", "tr"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["ABO"]\[]\0\0\\["RPO", "Simulation", "IEEE118", "IEEE300", "MemoryMatrix", "Decomposition"]\["TD"]\1\1\1\["China"]\["South China University of Technology", "Kunming University of Science and Technology"]\["Electric Power Engineering"]\0\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2604038904\Transfer reinforcement learning framework for energy saving in next generation wireless networks\Some("https://repository.iiitd.edu.in/jspui/handle/123456789/417")\["s_i"]\["same_all"]\["pi"]\["ap", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Actor-Critic"]\["N/A"]\0\0\\["Energy Saving", "Network", "WiFi", "RealWorld", "Simulation"]\["TD"]\1\0\0\["India"]\["Indraprastha Institute of Information Technology"]\["Electronics and Communication Engineering"]\2\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2910425974\Knowledge transfer in deep reinforcement learning\Some("https://addi.ehu.es/handle/10810/30667")\["t", "levels"]\["multi"]\["pi"]\["j", "tt"]\\["Custom", "CS", "TensorFlow", "Figures", "Formulas", "Pseudo", "Keras", "Gym", "Tables"]\["DQN"]\["N/A"]\0\1\\["Simulation", "VideoGames", "Games", "Atari", "SpaceInvaders", "DemonAttack", "Qbert", "CNN"]\["TD"]\1\0\0\["Spain"]\["University of the Basque Country"]\["Computer Engineering and Intelligent Systems"]\2\["h"]\["mag"]\["t", "r", "l", "k"]\[]\[]\\\\\
867213644\Task Localization Similarity and Transfer; Towards a Reinforcement Learning Task Library System\Some("http://scholarsarchive.byu.edu/cgi/viewcontent.cgi?article=1578&context=etd")\["s_i", "s_f", "t", "levels"]\["multi"]\["pi", "pi_fix", "pi_dyn"]\["tt"]\\["Custom", "CS", "Formulas", "Figures", "Tables", "Theorem"]\["Q"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Robotics", "NomadII", "Simulation"]\["TD", "H", "B"]\1\0\0\["USA"]\["Brigham Young University"]\["Computer Science"]\2\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\\["Grid", "Obstacles"]
2769673101\Policy Forgetting Method with Step Units for Decreasing Negative Transfer in Transfer Learning of Reinforcement Learning\Some("https://www.jstage.jst.go.jp/article/jsmermd/2017/0/2017_2P1-F06/_article/-char/ja/")\[]\[]\[]\[]\\["Custom", "CS"]\[]\[]\2\2\\["Robotics"]\["TD"]\0\1\0\["Japan"]\[]\[]\0\[]\["mag"]\["t", "r", "l"]\[]\[]\\\\\
2975838601\A knowledge transfer combined reinforcement learning method and a learning method applied to autonomous skills of an unmanned vehicle\Some("https://lens.org/133-719-533-054-272")\[]\[]\[]\[]\\["Custom", "CS"]\[]\[]\2\2\\["Patent", "UAV", "Robotics", "Aerial"]\[]\0\0\0\["China"]\["University of Shanghai"]\["Patent"]\0\[]\["mag"]\["t", "r", "l", "k", "s"]\["t", "r", "l", "k", "s"]\[]\\\\\
2967023422\Aircraft full-automatic pneumatic optimization method based on reinforcement learning and transfer learning\Some("https://lens.org/148-968-536-858-673")\[]\[]\[]\[]\\["Custom", "CS", "Formulas"]\[]\[]\2\2\\["Patent", "Aircraft", "Pneumatic", "Robotics", "Aerial"]\[]\0\0\0\["China"]\["Tsinghua University"]\["Patent"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\[]\\\\\
2091079925\Tactual-kinesthetic feedback from manipulation of visual forms and nondifferential reinforcement in transfer of perceptual learning\Some("https://philpapers.org/rec/BENTFF-2")\[]\[]\[]\[]\\[]\[]\[]\2\2\\["Psychology"]\[]\1\1\0\["USA"]\[]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2970910303\Human Sub-goal Transfer in Hierarchical Reinforcement Learning\Some("https://confit.atlas.jp/guide/event/jsai2019/subject/1Q2-J-2-02/detail")\[]\["same_all"]\["options"]\[]\\["Custom", "CS", "Pseudo", "Tables", "Figures", "Formulas"]\["Q"]\[]\0\0\\["SubGoal", "Transfer", "Simulation"]\["H", "TD"]\1\0\0\["Japan"]\["SOKENDAI", "National Institute of Informatics"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\[]\\\\\
3005714659\Transfer Learning for Reinforcement Learning Domains: A Survey\Some("https://dl.acm.org/doi/10.5555/1577069.1755839")\[]\["theory", "survey"]\[]\[]\\[]\[]\[]\2\2\\["Theory", "survey"]\[]\1\0\0\["USA"]\["University of Southern California", "The University of Texas"]\["Computer Science"]\0\[]\["mag", "lazaric", "silva", "cur", "zhu"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k", "s"]\\\\\
2914188912\Transfer reinforcement learning for task-oriented dialogue systems\Some("http://repository.ust.hk/ir/Record/1783.1-92234")\["s", "t"]\["multi"]\["WT", "special_weights"]\["ap"]\\["Custom", "CS", "Formulas", "Figures", "Tables", "Pseudo"]\["Q", "PETAL", "PROMISE"]\["exp"]\0\0\\["Task", "Oriented", "Dialogue", "System", "CNN", "NLP", "Simulation"]\["TD", "H", "B"]\1\0\0\["China"]\["Hong Kong University of Science and Technology"]\["Computer Science and Engineering"]\1\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2945850821\Improving Deep Reinforcement Learning Using Graph Convolution and Visual Domain Transfer\Some("https://tigerprints.clemson.edu/all_dissertations/2268/")\["s"]\["same_all"]\["Q"]\["tr", "j"]\\["Custom", "CS", "Formulas", "Pseudo", "Theorem", "TensorFlow", "Tables", "Figures"]\["GVIN", "VIN", "DQN"]\[]\0\1\\["2D", "Simulation", "Graph", "IrregularGraph", "RoadNetwork", "Navigation", "CARLA", "CNN", "MonteCarlo"]\["TD", "MB"]\1\0\0\["USA"]\["Clemson University"]\["Computer Engineering"]\1\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\\
2886781774\Transfer Surface for Transfer Learning in Reinforcement Learning\Some("https://confit.atlas.jp/guide/event/ssii2018/subject/IS2-27/detail")\[]\[]\[]\[]\\[]\[]\[]\2\2\\[]\[]\0\1\0\["Japan"]\["Tokyo Polytechnic University", "The University of Tokyo", "Tokyo Denki University"]\[]\0\[]\["mag"]\["t", "r", "l"]\[]\[]\\\\\
1483433356\Skill Transfer of a Mobile Robot Obtained by Reinforcement Learning to a Different Mobile Robot\Some("https://dblp.uni-trier.de/db/series/sci/sci266.html#KameiI10")\["t"]\["same_all"]\["Q"]\["j"]\\["Custom", "CS", "Formulas", "Figures"]\["Q"]\["N/A"]\0\0\\["Simulation", "RealWorld", "Robot", "Navigation", "2D", "GeneticAlgorithm", "Vehicle", "MultiAgent"]\["TD", "H", "B"]\1\1\1\["Japan"]\["Nishinippon Institute of Technology", "Kyushu Institute of Technology"]\["Electrical, Electronics and Information Technology"]\0\[]\["mag"]\["t", "r", "l", "s"]\["t", "r", "l", "s"]\["t", "r", "l", "s"]\\\\\["Obstacles"]
2770456017\Multi-Agent Reinforcement Learning: - Learning Effects by Transfer Learning Eligibility Trace and Sensor Range - ―Learning Effects of Transfer Learning, Eligibility, and Sensor Range―\Some("https://www.jstage.jst.go.jp/article/jsmermd/2017/0/2017_2A1-G09/_article/-char/ja/")\[]\["multi", "multiagent"]\["rule"]\[]\\["Custom", "CS"]\["Q"]\[]\2\0\\["Simulation", "NarrowPath", "Rules"]\["TD"]\0\1\0\["Japan"]\[]\[]\0\[]\["mag"]\[]\[]\[]\\\\\
2778259398\Evolutionary transfer learning for complex multi-agent reinforcement learning systems\Some("https://dr.ntu.edu.sg/handle/10356/72997")\["s_i", "s_f", "levels"]\["multi"]\["Q", "advisor"]\["ap"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["Q", "FALCON", "BP", "eTL", "PTL", "AE-AVG", "MAS"]\["N/A"]\0\1\\["2D", "3D", "Simulation", "Games", "Navigation", "Minefield", "MinefieldNavigation", "UnrealTournament", "MultiAgent"]\["TD", "MB"]\1\0\0\["Singapore"]\["Nanyang Technological University"]\["Interdisciplinary Graduate School"]\1\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
2292364897\Transfer in reinforcement learning\Some("https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.675561")\["t"]\["same_all"]\["Q", "pri"]\["j", "tr", "tt"]\\["Custom", "CS", "Pseudo", "Formulas", "Theorem", "Figures", "Tables", "Lemma"]\["Q", "SARSA"]\[]\0\0\\["Simulation", "CartPole", "MountainCar", "Acrobot", "Pinball", "ClassicControl"]\["TD", "MB"]\1\0\0\["UK"]\["University of Aberdeen"]\["Computer Science"]\1\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "s", "k"]\["t", "r", "l", "s", "k"]\\\\\
1511270451\Graph Laplacian Based Transfer Learning Methods in Reinforcement Learning\Some("http://cdn.intechweb.org/pdfs/10989.pdf")\["v", "s", "levels", "s_i", "s_f"]\["multi"]\["Q", "pvf"]\["j"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["Q", "LaplacianGraph"]\["exp"]\0\0\\["2D", "Navigation", "Grid", "Maze", "Simulation"]\["TD"]\1\0\0\["Taiwan"]\["National Tsing-Hua University"]\["Computer Science"]\1604516390\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\\\8\["Grid", "Maze"]
2559014014\Parallel Transfer Learning: Accelerating Reinforcement Learning in Multi-Agent Systems\Some("http://www.tara.tcd.ie/handle/2262/77886")\[]\["same_all"]\["Q", "r", "imitation"]\["j", "tt", "ap"]\\["Custom", "CS", "Pseudo", "Figures", "Tables", "Formulas", "GridLab"]\["Q", "DistributedW"]\["exp"]\0\0\\["MultiAgent", "Simulation", "MountainCar", "CartPole", "GridLab", "SmartGrid", "Parallel", "ClassicControl"]\["TD"]\1\0\0\["Ireland"]\["University of Dublin"]\["Computer Science"]\1\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2473744971\Policy abstraction for transfer learning using learning vector quantization in reinforcement learning framework\Some("http://umpir.ump.edu.my/id/eprint/13521/")\["s_i", "s_f", "levels", "t"]\["multi"]\["pri"]\["tt"]\\["Custom", "CS", "Formulas", "Figures", "Tables", "Pseudo"]\["Q", "LVQ"]\["N/A"]\0\0\\["2D", "Navigation", "Grid", "Maze", "Simulation"]\["TD"]\1\0\0\["Japan"]\["Kyushu University"]\["Electrical and Electronic Engineering"]\1\["h"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2745684090\Effects of catechol-O-methyltransferase on reinforcement learning and mesolimbic dopamine transmission\Some("https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.714050")\[]\[]\[]\[]\\[]\[]\[]\2\2\\["Neurosciences", "neurology"]\[]\1\0\0\["UK"]\["University of Oxford"]\[]\1\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2613227537\Sample Complexity Reduction in Reinforcement Learning by Transferred Transition and Reward Probability\Some("https://www.ieice.org/ken/paper/20131113eB7B/eng/")\["t"]\["multi"]\["fea", "r"]\[]\\["Custom", "CS"]\["TR-MAX"]\[]\0\0\\["MDP", "PAC-MDP"]\["TD", "MB"]\0\1\0\["Japan"]\["Tohoku University"]\["Graduate School of Information Sciences"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\[]\\\\\
2523226036\Integrating policy transfer policy reuse and experience replay in speeding up reinforcement learning of the obstacle avoidance task\Some("http://erepository.uonbi.ac.ke/handle/11295/90177")\["a"]\["multi"]\["pi", "experiencereplay"]\["j", "tt", "ap"]\\["Custom", "CS", "Figures", "Formulas", "Tables", "Pseudo"]\["SARSA"]\["sup"]\0\0\\["3D", "Navigation", "Robotics", "Obstacle", "Avoidance", "ExperienceReplay", "Vehicle", "Simulation"]\["TD", "H"]\1\0\0\["Kenya"]\["University of Nairobi"]\["Computing and Informatics"]\1\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\\\
1470793189\Transferring Human Navigation Behaviors Into Robot Planning using Inverse Reinforcement Learning Techniques\Some("https://idus.us.es/xmlui/handle/11441/26823")\["s"]\["same_all"]\["I"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["GPIRL", "PRX"]\["N/A"]\0\0\\["Inverse", "IRL", "RealWorld", "Simulation", "Navigation", "Human", "Demonstration"]\["TD", "B"]\1\0\0\["Spain"]\["University of Seville"]\["Systems Engineering"]\1\["all"]\["mag"]\["t", "r", "l"]\[]\["t", "r", "l", "k", "s"]\\\\\["RealRobot"]
1975326778\A Container Transfer Scheduling Using Reinforcement Learning\Some("https://www.jstage.jst.go.jp/article/ieejias/123/10/123_10_1111/_pdf")\[]\["pure"]\[]\[]\\[]\[]\[]\2\2\\["BlockStacking", "ContainerTerminal", "ContainerScheduling", "Simulation"]\[]\1\0\0\["Japan"]\["Hiroshima Research & Development Center"]\["Industry"]\0\[]\["mag"]\["t", "r", "l"]\["r", "l"]\[]\\\\\
2327840474\The Knowledge Transfer Methods and Its Application in Robot Based on Subtask Hierarchical Reinforcement Learning\Some("http://www.aicit.org/aiss/global/paper_detail.html?jname=AISS&q=1770")\[]\[]\[]\[]\\[]\[]\[]\2\2\\["Robotics", "Cleaning", "RealWorld"]\[]\0\1\0\[]\[]\[]\0\[]\["mag"]\[]\[]\[]\\\\\
1490764095\An Intelligent Marshaling Based on Transfer Distance of Containers Using a New Reinforcement Learning for Logistics\Some("http://cdn.intechopen.com/pdfs/13071/InTech-An_intelligent_marshaling_based_on_transfer_distance_of_containers_using_a_new_reinforcement_learning_for_logistics.pdf")\[]\["pure"]\[]\[]\\[]\[]\[]\2\2\\["BlockStacking", "ContainerTerminal", "Simulation"]\[]\1\0\0\["Japan"]\["Osaka Institute of Technology"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2340098619\A New Reinforcement Learning Method for Train Marshaling Based on the Transfer Distance of\Some("http://www.iaeng.org/publication/IMECS2011/IMECS2011_pp80-85.pdf")\[]\["pure"]\[]\[]\\[]\[]\[]\2\2\\["BlockStacking", "ContainerTerminal", "ContainerScheduling", "Locomotive", "Simulation"]\[]\1\0\0\["Japan"]\["Osaka Institute of Technology"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2094258649\Acquisition and transfer of conceptual learning by normals and retardates: the effects of modeling verbal cues and reinforcement.\Some("http://www.ncbi.nlm.nih.gov/pubmed/4429530")\[]\[]\[]\[]\\[]\[]\[]\2\2\\["Psychology"]\[]\0\1\0\["USA"]\["University of Georgia"]\[]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\[]\\\\\
2182493345\Task Similarity Measures for Transfer in Reinforcement Learning Task Libraries\None\["s_i", "s_f", "levels"]\["multi"]\["fea"]\["ap"]\\["Custom", "CS", "Figures", "Formulas", "Tables"]\["Q"]\["N/A"]\0\0\\["Simulation", "2D", "Navigation", "Grid", "MovingGoal"]\["TD"]\1\0\0\["USA"]\["Brigham Young University"]\["Computer Science"]\1582256513\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\3\
2184525547\ERRORLESS LEARNING: REINFORCEMENT CONTINGENCIES AND STIMULUS CONTROL TRANSFER IN DELAYED PROMPTING\None\[]\["Neurology"]\[]\[]\\[]\[]\[]\2\2\\["Neurology"]\[]\1\0\0\["USA"]\["Harvard Medical School"]\["Neurology"]\0\[]\["mag"]\[]\[]\[]\\\\\
266489658\PREDICTION OF MIXED SCHEMA LEARNING IN A REPRODUCTION TASK. THE EFFECTS OF INCIDENTAL LEARNING AND REINFORCEMENT ON SCHEMATA LEARNING AND SCHEMATA TRANSFER. INTERIM REPORT.\Some("https://eric.ed.gov/?id=ED016261")\[]\["psychology"]\[]\[]\\[]\[]\[]\2\2\\["Psychology"]\[]\1\0\0\["USA"]\["U.S. DEPARTMENT OF HEALTH, EDUCATION, AND WELFARE"]\[]\0\[]\["mag"]\[]\[]\[]\\\\\
24477102\Training and Tracking in robotics\Some("https://www.ijcai.org/Proceedings/85-1/Papers/129a.pdf")\["t"]\["same_all"]\["Q"]\["tt"]\\["Custom", "CS", "Formulas"]\["Q"]\["N/A"]\0\0\\["CartPole", "Simulation", "ClassicControl"]\["TD"]\1\0\0\["USA"]\["GTE Labs", "University of Massachusetts"]\["Computer Science"]\0\["h"]\["taylor"]\[]\["r", "l"]\["l"]\\\\\
1584313244\Vision-based Behavior Acquisition for a Shooting Robot by using RL by Asada\Some("http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54.6159&rep=rep1&type=pdf")\["s_i"]\["same_all"]\["Q"]\["tt"]\\["Custom", "CS", "Figures", "Formulas", "Pseudo", "Tables"]\["Q"]\["N/A"]\0\0\\["Robotics", "Navigation", "RealWorld"]\["TD"]\1\0\0\["Japan"]\["Osaka University"]\["Mechanical Engineering"]\0\["h"]\["taylor"]\["r", "l"]\["r", "l"]\["r", "l"]\\\\\["RealRobot"]
2012036715\Transfer of learning by composing solutions of elemental sequential tasks by Satinder Singh\Some("https://link.springer.com/content/pdf/10.1007/BF00992700.pdf")\["r"]\["same_all"]\["Q"]\["ap", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Q", "CQL"]\["N/A"]\0\0\\["2D", "Grid", "Navigation", "Simulation"]\["TD"]\1\0\0\["USA"]\["University of Massachusetts"]\["Computer Science"]\0\["all"]\["taylor"]\["t", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\["Grid", "Obstacles"]
2117629901\A comparison of direct and model-based RL\\["r"]\["same_all"]\["model"]\["j", "ap", "tr"]\\["Custom", "CS", "Formulas", "Figures"]\["Q", "CMAC"]\["N/A"]\0\0\\["Robotics", "Simulation"]\["MB"]\1\1\1\["USA"]\["Georgia Institute of Technology"]\["College of Computing"]\0\["all"]\["taylor"]\["r", "l"]\["r", "l"]\["t", "r", "l"]\\\\\
1822705290\Effective control knowledge transfer through learning skill and representation hierarchies\\["r"]\["same_all"]\["options"]\["tt"]\\["Custom", "CS", "Formulas", "Figures", "Web", "Dead", "UrbanCombatTestbed"]\["Q"]\["N/A"]\0\0\\["3D", "Navigation", "Simulation"]\["H"]\1\0\0\["USA"]\["West Chester University of Pennsylvania", "The University of Texas"]\["Computer Science", "Computer Science and Engineering"]\0\["h"]\["taylor"]\["t", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\
2089561656\State Abstraction for programmable reinforcement learning\\["r", "s_i", "s_f"]\["same_all"]\["pi"]\["tr"]\\["Custom", "CS", "Figures", "Formulas", "Pseudo", "Tables", "Theorem"]\["Q", "MAXQ", "Alisp"]\["N/A"]\0\0\\["2D", "Grid", "Navigation", "Taxi", "Simulation"]\["H"]\1\0\0\["USA"]\["UC Berkeley"]\["Computer Science"]\0\["h"]\["taylor"]\["r", "l"]\["t", "r", "l"]\["t", "r", "l"]\\\\\["Taxi", "Grid"]
1612195517\Relativized Options: Choosing the right transformation\\["t", "s_i", "s_f"]\["same_all"]\["options"]\["tr"]\\["Custom", "CS", "Formulas", "Figures", "Web", "Dead"]\["Q", "SMDP-Q"]\["N/A"]\0\0\\["2D", "Grid", "Navigation", "Collect", "Simulation"]\["TD", "H"]\1\0\0\["USA"]\["University of Massachusetts"]\["Computer Science"]\0\["h"]\["taylor"]\[]\["l"]\["t", "r", "l"]\\\\\["Grid", "Rooms"]
1607318605\Proto-transfer learning in Markov decision process using spectral methods\\["r", "s_i", "s_f"]\["same_all"]\["pvf"]\["tt"]\\["Custom", "CS", "Pseudo", "Figures", "Formulas", "Tables"]\["Q", "PVF"]\["N/A"]\0\0\\["2D", "Grid", "Navigation", "Simulation"]\["batch"]\1\0\0\["USA"]\["University of Massachusetts"]\["Computer Science"]\0\["h"]\["taylor"]\[]\["t", "r", "l"]\["t", "r", "l"]\\\\\["Grid", "TwoRoom", "Rooms"]
2165792602\Improving action selection in MDP’s via knowledge transfer\Some("https://www.aaai.org/Papers/AAAI/2005/AAAI05-162.pdf")\["s_f", "t"]\["same_all"]\["A"]\["tr"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Lemma", "Theorem"]\["Q", "RTP"]\["N/A"]\0\0\\["2D", "Grid", "Navigation", "Simulation"]\["TD"]\1\0\0\["USA"]\["The University of Texas"]\["Computer Science"]\0\["mod"]\["taylor"]\["k", "t"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
1598748993\Structure in the space of value functions\Some("https://link.springer.com/content/pdf/10.1023/A:1017944732463.pdf")\["s_f"]\["multi"]\["sub"]\["j", "tr"]\\["Custom", "CS", "Figures", "Formulas"]\[]\["N/A"]\0\0\\["2D", "Grid", "Navigation", "Simulation"]\["TD", "H"]\1\0\0\["UK"]\["University of Edinburgh"]\["Computational Neuroscience"]\0\["all"]\["taylor"]\["r", "l"]\["r", "l"]\["r", "l", "k"]\\\\\["Grid", "Maze"]
1974043469\Probabilistic Policy Reuse in a Reinforcement Learning Agent\Some("https://www.researchgate.net/profile/Fernando_Fernandez9/publication/221455416_Probabilistic_policy_reuse_in_a_reinforcement_learning_agent/links/0deec5188b4c49dce6000000.pdf")\["s_i", "s_f"]\["multi"]\["pi"]\["tr"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["Q"]\["N/A"]\0\0\\["2D", "Grid", "Navigation", "Simulation"]\["TD"]\1\0\0\["USA"]\["Carnegie Mellon University"]\[]\0\["lib"]\["taylor"]\["r", "l"]\["l"]\["t", "r", "l", "k"]\\\\\["Grid", "Rooms"]
2042357378\Multitask reinforcement learning on the distribution of MDPs\Some("http://fumihide-tanaka.org/lab/content/files/research/Tanaka_CIRA-03.pdf")\["t"]\["multi"]\["Q"]\["j", "tr"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["Q", "BPSb"]\["N/A"]\0\0\\["2D", "Grid", "Navigation", "Simulation"]\["TD"]\1\0\0\["Japan"]\["Tokyo Institute of Technology"]\["Computational Intelligence and Systems Science"]\0\["all"]\["taylor"]\["r", "l"]\["r", "l"]\["t", "r", "l"]\\\\\["Grid", "Maze"]
1234567\Model Transfer for Markov Decision Tasks via Parameter Matching\Some("https://d1wqtxts1xzle7.cloudfront.net/46415887/sunmola.pdf?1465736901=&response-content-disposition=inline%3B+filename%3DModel_transfer_for_Markov_decision_tasks.pdf&Expires=1600942573&Signature=atTnvjL7jeDD4y14s3BhAh7tAjkakWokPE5bPdMQ1LlMBrQZXyC9MvdejtGlscqunfUu7Z8chRUA2Q3xHD7ru7GO1XbeHS5XdJjq~9BMj2FO7GBZPbl4UK2n9ISGPh6~HkrfSx3A~lrBQsezoRXsk13B7J4yLO2jrb0ATOwb4H3sxUaqFz5kpD1cmH-noDDVg5rFvxNFe7VyAWlt5MP-j-47DYyrhfGOFfDV5L5xLuci1~t50l6Mvh4ybKgGNQQBCo1laKi2AHhrZjdlMIa1o2yEGqNez45exaJBR3PM1cD6NPkMxD1XQqzeDgAN1CS5DFRu5EUz7Z0GYOZhrooT1Q__&Key-Pair-Id=APKAJLOHF5GGSLRBV4ZA")\["t"]\["multi"]\["pri"]\["j", "tr"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Tables"]\["Bayesian"]\["N/A"]\0\0\\["MDP"]\["B"]\1\0\0\["UK"]\["University of Birmingham", "University Hospital Birmingham"]\["Computer Science"]\0\["all"]\["taylor"]\["t"]\["t", "k"]\["t", "r", "l", "k"]\\\\\
2169743339\Multi-task reinforcement learning: a hierarchical Bayesian approach\Some("http://web.engr.oregonstate.edu/~tadepall/papers/mtrl-icml07.pdf")\["r", "s_f"]\["multi"]\["pri"]\["j", "tr"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["Bayesian"]\["N/A"]\0\0\\["2D", "Navigation", "Maze", "Grid"]\["B"]\1\0\0\["USA"]\["University of Massachusetts"]\["Electrical Engineering and Computer Sciences"]\0\["all"]\["taylor"]\["r", "l"]\["r", "l"]\["t", "r", "l", "k"]\\\\\
7654321\Transferring state abstractions between MDPs\Some("https://www.academia.edu/download/43419216/transferWorkshop.pdf")\["r", "s"]\["multi"]\["fea"]\["tt"]\\["Formulas", "Figures", "Theorem"]\["GATA"]\["N/A"]\0\0\\["MDP"]\["any"]\1\0\0\["USA"]\["Rutgers University"]\["Computer Science"]\0\["all"]\["taylor"]\["t"]\["t", "l", "k"]\["t", "r", "l", "k"]\\\\\
2106953752\General game learning using knowledge transfer\Some("https://www.aaai.org/Papers/IJCAI/2007/IJCAI07-107.pdf")\["a", "v"]\["diff-no"]\["fea"]\["ap", "j", "tr"]\\["Custom", "CS", "Formulas", "Figures", "Tables"]\["Q"]\["N/A"]\0\0\\["Games", "GGP", "BoardGames", "Connect3", "CaptureGo", "Othello", "Simulation", "Games"]\["TD"]\1\0\0\["USA"]\["The University of Texas"]\["Computer Science"]\0\["h"]\["taylor"]\["t", "l", "k"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2134153324\Generalizing plans to new environments in relational MDPs\Some("http://www.cs.ucf.edu/~gitars/cap6938/guestrin03generalizing.pdf")\["#"]\["diff-no"]\["Q"]\["j"]\\["Custom", "CS", "Figures", "Formulas", "Videos", "Theorem", "Lemma"]\["CPLEX"]\["N/A"]\0\0\\["RTS", "Freecraft", "Games", "VideoGames", "Simulation"]\["LP"]\1\0\0\["USA"]\["Stanford University"]\["Computer Science"]\0\["h"]\["taylor"]\[]\[]\["r", "l"]\\\\\
2166798247\Transfer learning in real-time strategy games using hybrid CBR/RL\Some("https://www.aaai.org/Papers/IJCAI/2007/IJCAI07-168.pdf")\["#"]\["diff-no"]\["Q"]\["j", "tr"]\\["Custom", "CS", "Figures", "Tables", "Formulas"]\["Q"]\["N/A"]\0\0\\["RTS", "MadRTS", "Games", "VideoGames", "Simulation"]\["TD", "CBR"]\1\0\0\["USA"]\["Georgia Institute of Technology"]\["College of Computing"]\0\["h"]\["taylor"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\\
2122982548\Giving advice about preferred actions to reinforcement learners via knowledge-based kernel regression\Some("https://www.aaai.org/Papers/AAAI/2005/AAAI05-129.pdf")\["a", "r", "v"]\["diff-it"]\["rule", "advisor"]\["j", "tr"]\\["Custom", "CS", "Formulas", "Figures"]\["Q"]\["sup"]\0\0\\["2D", "RoboCup", "BreakAway", "Simulation", "MultiAgent", "3v2"]\["TD"]\1\0\0\["USA"]\["University of Minnesota", "University of Wisconsin"]\["Computer Science"]\0\["h"]\["taylor"]\["r", "l", "k"]\["r", "l", "k"]\["r", "l", "k"]\\\\\
1506146479\Skill acquisition via transfer learning and advice taking\Some("https://link.springer.com/content/pdf/10.1007/11871842_41.pdf")\["a", "r", "v"]\["diff-it"]\["rule", "advisor"]\["j", "tr"]\\["Custom", "CS", "Aleph", "Pseudo", "Tables", "Figures", "Formulas"]\["Q"]\["sup"]\0\0\\["2D", "RoboCup", "BreakAway", "Simulation", "MultiAgent", "4v3", "3v2", "MoveDownfield", "KeepAway"]\["TD"]\1\0\0\["USA"]\["University of Minnesota", "University of Wisconsin"]\["Computer Science"]\0\["h"]\["taylor"]\["t", "l", "s"]\["t", "r", "l", "s"]\["t", "r", "l", "s", "k"]\\[["KeepAway4v3", "Breakaway3v2"], ["MoveDownfield3v2", "Breakaway3v2"]]\\\
2123995443\Graph-based domain mapping for transfer learning in general games\Some("https://link.springer.com/content/pdf/10.1007/978-3-540-74958-5_20.pdf")\["a", "v"]\["lit"]\["Q"]\["j", "tr"]\\["Custom", "CS", "Pseudo", "Figures", "Formulas", "Tables", "Theorem"]\["Q", "RuleGraph"]\["T"]\0\0\\["GGP", "Simulation", "Games", "BoardGames", "MiniChess"]\["TD"]\1\0\0\["USA"]\["The University of Texas"]\["Computer Science"]\0\["h"]\["taylor"]\["t", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\[["TicTacToe", "MiniChess"], ["Checkers", "MiniChess"], ["MiniChess4", "MiniChess5"]]\\\
203338875\An experts algorithm for transfer learning\Some("https://www.aaai.org/Papers/IJCAI/2007/IJCAI07-172.pdf")\["a", "v"]\["lit"]\["N/A"]\["j"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures", "Theorem"]\["SARSA", "AtEasel"]\["Ma", "svg", "exp"]\0\0\\["2D", "RoboCup", "BreakAway", "Simulation", "MultiAgent"]\["all"]\1\0\0\["USA"]\["University of Michigan"]\["Computer Science and Engineering"]\0\["h"]\["taylor"]\["t", "l"]\["t", "l", "k"]\["t", "r", "l", "k"]\\[["RoboCup3v2", "RoboCup4v2"]]\\\
2138412825\Automatically Mapped Transfer between Reinforcement Learning Tasks via Three-Way Restricted Boltzmann Machines (abstract)\Some("https://research.tue.nl/en/publications/automatically-mapped-transfer-between-reinforcement-learning-task-2")\["a", "t", "r"]\["lit"]\["pi"]\["j", "tt"]\\["Custom", "CS", "Formulas", "Pseudo", "Figures"]\["RBM"]\["exp"]\1\1\\["Simulation", "MountainCar", "CartPole", "RBM", "ClassicControl"]\["TD"]\1\0\0\["Netherlands", "USA"]\["Maastricht University", "Washington State University"]\["Knowledge Engineering", "Electrical Engineering and Computer Science"]\0\["all"]\["mag"]\["t", "r", "l"]\["t", "r", "l"]\["t", "r", "l", "k"]\\[["Inverted Pendulum", "CartPole"], ["MountainCar", "CartPole"]]\\3\
2887657794\Self-organizing maps for storage and transfer of knowledge in reinforcement learning (abst)\Some("https://doi.org/10.1177/1059712318818568")\["s_i", "s_f"]\["same_all"]\["Q"]\["tr"]\\["Custom", "CS", "Pseudo", "Formulas", "Figures"]\["SOM", "Q"]\["SOM"]\0\0\\["SOM", "SelfOrganizingMap", "2D", "Navigation", "Simulation"]\["TD"]\1\0\0\["Singapore"]\["Singapore University of Technology and Design"]\["Computer Science"]\0\["lib"]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k"]\\\\3\
2959558594\Credit Assignment as a Proxy for Transfer in Reinforcement Learning.\Some("https://arxiv.org/abs/1907.08027")\["v", "s_i", "s_f", "t", "r"]\["multi"]\["r", "credit"]\["tt"]\\["Custom", "CS", "Formulas", "Figures"]\["Q", "PPO", "DQN"]\["N/A"]\0\1\\["2D", "Navigation", "Grid", "DMLab", "Credit", "Simulation"]\["TD"]\1\0\0\["USA"]\["Google Research"]\["Industry"]\0\[]\["mag"]\["t", "r", "l"]\["t", "r", "l", "k"]\["t", "r", "l", "k", "s"]\\\1\3\