<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Basic Meta Tags -->
  <meta charset="UTF-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">

  <!-- SEO Meta Tags -->
  <meta name="description" content="Comprehensive AGI Risk Analysis">
  <meta name="keywords" content="agi, risk, convergence">
  <meta name="author" content="Forrest Landry">
  <meta name="robots" content="index, follow">

  <!-- Favicon -->
  <link rel="icon" href="https://github.githubassets.com/favicons/favicon-dark.png" type="image/png">
  <link rel="shortcut icon" href="https://github.githubassets.com/favicons/favicon-dark.png" type="image/png">

  <!-- Page Title (displayed on the browser tab) -->
  <title>Comprehensive AGI Risk Analysis</title>
</head>
<body>
  <p>
  TITL:
     <b>"Negative Arguments"</b>
     <b>By Forrest Landry</b>
     <b>Oct 10th, 2022</b>.
  </p>
  <p>
  ABST:
  </p>
  <p>
     On the distinction between arguing against something
     (as a kind of report of bad news)
     vs arguing for something (positive/beneficial).
  </p>
  <p>
  TEXT:
  </p>
  <p>
     Technical alignment rests on a premise:
       That all of a series of
       by-default-deficient crucial factors
       can be solved as to be 'not-deficient'.
  </p>
  <p>
     It tacitly assumes a positively specified hypothesis:
       That <b>all</b> of the relevant critical factors
       (that must be) can be, and can at once be
       simultaneously and continuously present
       so as to ensure both
       that generally-capable self-learning machines
       continue to exist <b>AND</b> will elect to keep
       the conditions of the ambient environment
       within the narrow ranges of magnitudes
       that humans (and other life) need to survive.
  </p>
  <p>
     That this positive condition constitutes a very
     minimal-threshold definition of AGI alignment
     and/or a very minimal definition of 'AGI safety'.
  </p>
  <p>
     Proving alignment of future AGI to be possible,
     even as a lower-bound statistical guarantee,
     is a conceptually daunting task.
     It depends on unknown unknowns --
     whatever number of crucial technical factors
     exist, any one of which <b>might</b> fall _outside_
     the hypothesis space covered by human learning,
     (and/or capability) and thus stay unsolved.
  </p>
  <p>
     However, proving AGI alignment to be impossible,
     on the other hand,
     is conceptually very much easier.
     Ie; to show that something would not be possible,
     even as a lower-bound statistical guarantee,
     it is only required to prove that,
     for at least one of the crucial technical factors,
     that specific selected
     and known necessary factor
     (out of however many unknown others)
     cannot be constantly present alongside
     another (any other) also crucial
     known necessary factor,
     that is also needed for humans to stay alive.
     This is a negatively specified hypothesis.
  </p>
  <p>
     Our work is to suggest and review
     a series of arguments (developed multiple ways)
     for why technical alignment
     is not possible, now or ever,
     since some misaligned code variants
     are not detectable (in practice or principle)
     nor are their effects containable.
     Ie; there is a 0% chance of anyone
     overcoming the problematic contradictions
     at any point in the future,
     <b>regardless</b> of whatever other constraints
     might later be identified, as also necessary
     to AGI alignment and/or safety.
  </p>
  <p>
  :drn
     > Your argument reads in the general pattern of:
     > 'convince the reader that certain problems exist'.
     > That is not helpful; we do not need more of that.
  </p>
  <p>
       Your opinion has been noted.
       What motivates that opinion?
  </p>
  <p>
     > What I am interested in is: 'convince the reader
     > that certain types of solutions to AGI problems
     > are worth trying, experimenting with, and supporting'.
  </p>
  <p>
     > As an engineer, I want to actually solve problems,
     > not create more of them.
     > The former is more fun; the latter, not so much.
  </p>
  <p>
     > Besides, if we are not even suggesting
     > that something is worth working on,
     > how do we expect anything to actually get done?.
     > No effort means no real progress in the world.
  </p>
  <p>
       Not all problems can be solved.
  </p>
  <p>
       If a given specific problem has been shown,
       via valid reason/logic and sound argument,
       to be actually and strictly within
       the class of known unsolvable problems;
       then it is simply not actually productive,
       to suggest to other people that they should
       "try to make progress" (on unsolvable problems),
       or that someone should provide unlimited funds
       to employ you to "work" on such issues, indefinitely.
  </p>
  <p>
       Moreover, a lot of less scrupulous people
       would simply suggest that something is possible,
       merely so that they can apply for support.
       It becomes marketing, basically, to get NSF grants.
       Or maybe they think OpenPhil may be a better option?
  </p>
  <p>
       Knowing otherwise, I will not pretend
       that there may be solutions
       available to the technical alignment problem,
       if there are consistent and sound arguments
       for why technical alignment is not solvable.
  </p>
  <p>
       In such circumstances, it is not modest to imply
       that progress can be made on all types of problems.
  </p>
  <p>
       And moreover, a specific unwillingness
       to even look at and/or actually consider
       the formal basis of the impossibility results
       (due to some kind of motivated opinion heuristic)
       is a kind of <b>group</b> intellectual dishonesty,
       especially when the precautionary principle
       is to be applied strongly in any area
       that has been clearly identified as
       a known planet-terminal x-risk.
  </p>
  <p>
       As a practicing large scale systems engineer
       with a number of successful deployed projects,
       I can personally relate with the good feeling
       of being able to analyze a problem
       into its constituent parts, in a neat way,
       so as to make a system's behavior predictable,
       and thusly of the satisfaction of constructing
       an actual working practical novel solution,
       something real people can use, on the ground,
       in their real lives.
  </p>
  <p>
       Doing the seemingly impossible, the miraculous,
       while knowing and using all manner of arcane forces,
       (and while also having zero inhibitions in so doing),
       is something of a dream --
       a technologist fantasy.
  </p>
  <p>
       Unfortunately, conventions of provincialism live large.
       In a world of 'short optimization cycles',
       the saving of countless lives in some long future,
       among some completely, utterly foreign people,
       living in a truly far and distant land,
       counts for exactly nothing here and now, today,
       (no corporate profits or marketing gains)
       though of course, it very much <b>should</b>.
         (ie; there is no one to thank you now,
         while you still live, and they, then,
         have not yet even begun their own lives).
  </p>
  <p>
       As such, I am also well aware of the
       responsibilities that <b>all</b> engineers have
       (and software development is _not_ excepted)
       to account to/for the real harms/risks
       to/for/in <b>all</b> of the (also future) lives
       affected by those (proposed) systems.
       It is not just about benefits,
       it is also always, about costs and risks,
       and whether,
       when considered in a socially responsible way,
       the "trade" is worth it, in community,
       for <b>everyone</b> involved (ie; _not_just_
       the investors trying to make a short term profit
       from your willing, paid-for corporate efforts).
         Example; Social media software in apps on phones
         has had clear damaging overall effects
         on culture, democracy, and civilization.
  </p>
  <p>
       Unfortunately, not all problems can be solved;
       not all dreams can be made real (in <b>this</b> universe).
       Sometimes, at a profound level, you run up against
       a deeper principled/fundamental contradiction
       and therefore a solution to that problem
       is not possible (in <b>any</b> universe).
  </p>
  <p>
       For some examples:.
  </p>
  <p>
         - where in abstract math:.
           - Simultaneous consistency and completeness
           of sufficiently expressive formal systems
           cannot be solved for (Gödel's incompleteness theorems).
  </p>
  <p>
         - where in distributed computing theory:.
           - Simultaneous message content consistency,
           and continuous availability of messages
           and tolerance to partitions
           cannot be solved for at once
           in a distributed data network
           (@ CAP theorem https://en.wikipedia.org/wiki/CAP_theorem).
  </p>
  <p>
         - where in physics/thermodynamics:.
           - that perpetual motion cannot be solved for.
  </p>
  <p>
       Even more unfortunately, in the past, historically,
       it often took a decade or more (maybe centuries)
       for experts of a given field of study/practice
       to come around to realizing that the problem,
       as defined by them, that they were trying to solve,
       is/was not (ever) actually solvable at all.
       I hope this does not happen to the "AI Safety Community" --
       particularly as they continue to move forward
       to constructing any number of "believable rationales"
       about our "not actually" making slow doom devices
       (in the form of superintelligent machines, etc).
  </p>
  <p>
         Note; that the majority of the researchers
         in the current AI x-safety community
         are not themselves directly working on
         constructing "slow-doom devices" (AGI/APS).
         Instead they are accidental tacit <b>enablers</b>.
         By engaging in so many "what if" scenarios,
         they provide the ready means for executives
         and product marketing specialists to 'spin'
         design and development researchers
         into believing the idea that "eventually,
         we will find a way" (to great value/profits!).
         It mistakenly facilitates people at corporations
         into falsely believing, and feeling more confident,
         that they can convince themselves and others
         that they can legitimately 'bring to market'
         these 'slow doom devices', because they are
         seemingly (though not actually) safe/aligned.
         Hence, the mass self-deception continues.
  </p>
  <p>
  :dv4
     > I suggest, where for anything you write, and/or present,
     > you should avoid the 'I am going to disprove X' framing
     > and instead use the 'I am going to argue for Y' framing.
     > People respond better to positive things (carrots)
     > more than they respond to negative things (sticks).
  </p>
  <p>
       How about something given in the form of:.
          "I am going to argue in favor of
          accepting the _'AGI_not_safe_at_all'_ argument,
          and to therefore also be in favor of
          <b>NOT</b> allowing _any_ AGI research and/or builds"?.
  </p>
  <p>
       However, we are suggesting a kind of 'not-action',
       (ie; do not build the doom/death making machine(s)),
       as a kind of verified/verifiable formal practice.
       Does that count as sufficiently positive/marketable?.
       Is it ethical for everyone to avoid meeting
       the publicist of an unpopular negative truth?
       Is it ever valid to shoot the messenger?
  </p>
  <p>
       Most tech people agree, where for technology, apps, etc --
       Where for anything that interacts with humans at all --
       that no one can make <b>any</b> system or device (AGI or not)
       fully 100% safe (or 'aligned' with the makers' interests).
       There will always be sufficiently smart motivated hackers
       and/or sufficiently stupid humans (end users)
       who will always be making critical mistakes,
       which disable fail-safes, security protocols,
       and enable extortion/stealing of (community) value
       from any app or service.
  </p>
  <p>
       This is not our concern here, even though some people,
       somehow, still hold out the impossible dream
       that superintelligent AGI will allow the building
       of cyber-physical systems that are "100% safe",
       or "incorruptible", and/or "perfectly aligned".
       Of course, this false dream is never going to happen,
       at least not for any reasonable definition
       of "safe" or "aligned" that actually has meaning.
       (ie; assuming that you are not also in some sort of
       weird techno-utopian singularitarian fantasy,
       with hopes of being a living-forever billionaire).
  </p>
  <p>
       Yet our concern is not just about proving
       that AGI systems cannot be exactly 100% safe --
       ie; about some sort of silly 'confidence' nit-pick
       that there is "a real difference" between
       'actually safe enough for practical use',
       and the irrational "absolute perfection 100% safe".
       This is not an argument against 100% fail-safe alignment;
       any 100% fail-safe alignment is impossible in any case.
  </p>
  <p>
       Our claim is about showing that AGI/APS cannot be,
       over the long term, eventually and for sure,
       anything other than actually 100% harmful --
       ie, not safe at all, not for <b>anything</b>,
       not for even ridiculously bad notions of "safe",
       where the bar is so low as to be unrecognizable
       to most forms of popular human common sense.
  </p>
  <p>
       Unfortunately, this result is also rather negative.
       I simply do not see how any sort of marketing "spin"
       can help here.  We need to stop investing/wasting
       time and resources on the actually impossible,
       and/or on delusions that it could ever be
       any other way at all, no matter how utopian
       and/or hyped the AGI benefit outcome fantasies
       were at one time.
  </p>
  <p>
       Getting something from nothing
       (getting benefit without effort)
       is simply another perpetual motion delusion --
       no matter how hopeful
       the researchers or investors might be
       or what buzzword terms are suggested
       along the way to yet more lost capital.
  </p>
  <p>
       Of course, no investor will ever thank us
       for saving the resource, for helping them to avoid
       near total embarrassment and eventual total ruin.
       Non-investments make no money.
  </p>
</body>
</html>