<!DOCTYPE html>
<html lang="en">
<head>
<!-- Basic Meta Tags -->
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<!-- SEO Meta Tags -->
<meta name="description" content="Comprehensive AGI Risk Analysis">
<meta name="keywords" content="agi, risk, convergence">
<meta name="author" content="Forrest Landry">
<meta name="robots" content="index, follow">
<!-- Favicon -->
<link rel="icon" href="https://github.githubassets.com/favicons/favicon-dark.png" type="image/png">
<link rel="shortcut icon" href="https://github.githubassets.com/favicons/favicon-dark.png" type="image/png">
<!-- Page Title (displayed on the browser tab) -->
<title>Comprehensive AGI Risk Analysis</title>
</head>
<body>
<p>
TITL:
</p>
<p>
<b>Superintelligence Safety Question and Answer</b>
<b>By Forrest Landry</b>
<b>Oct 6th, 2022</b>
</p>
<p>
ABST:
</p>
<p>
- as expressing ideas as to why general AI/APS
is inherently, in principle, un-containable
and therefore, in the long term, inherently unsafe.
</p>
<p>
TEXT:
</p>
<p>
- where listing some common associated acronyms:.
- "AGI"; as Artificial General Intelligence.
- "APS"; as Advanced Planning and Strategically aware System(s).
- "MAI"; as a Misaligned/mis-configured Artificial Intelligence.
- "MIA"; as a Misaligned/mis-configured Intelligent Agent/agency.
- "AAA"; as Adverse Artificial Agency/agent.
- "AIA"; as an Adverse Intelligent Agent.
</p>
<p>
:hvu
> - ?; why is it important to consider
> formal impossibility proofs of long term
> AGI/APS/Superintelligence safety?.
</p>
<p>
- that even a/one single formal complete proof
(for whatever reason, on whatever basis)
of inherent AGI non-alignment, non-safety
has several significant implications,
particularly in regards to present and future efforts.
</p>
<p>
- that we should (therefore) determine if:.
</p>
<p>
- 1; <b>if</b> AGI cannot ever be safe,
or made safe, or forced to be aligned, etc.
- as that we should be very very clear
if there is an inherent existential risk
of long term terminal total extinction
of all of humanity, of all of life,
inherently associated with super-intelligence --
with AGI/APS usage --
of any type,
due to the artificiality itself,
regardless of its construction, algorithm, etc.
</p>
<p>
- 2; <b>if</b> the real risks, costs, harms
of naively attempting using/deploying any AGI/APS
will always be, in all contexts,
for any/all people (any/all life)
inherently strictly greater than
whatever purported benefits/profits
might falsely be suggested
"that we will someday have".
- ie, where we are not deluding ourselves
with false hype/hope.
</p>
<p>
- 3; <b>if</b> any and all efforts
to develop new or "better" tools
of formal verification of AGI safety
are actually pointless.
- ie; that it does not help us
to have researchers spending time
chasing a false and empty dream
of unlimited profits and benefits.
</p>
<p>
- ?; why would anyone want to be 'that person'
who suggests investing hundreds of thousands
of engineer man-years
into the false promise of obtaining AGI safety,
when a single proof of impossibility --
one person working a few months
on their own for nothing --
could make all of that investment instantly moot?.
- that any investment into attempting
to develop long term AGI safety
is a bit like investing in "perpetual motion machines"
and/or "miracle medicines", etc.
- as a significant opportunity cost
associated with dumping hundreds of millions
of dollar equivalent resources and capital
to buy a lot of wasted time and effort.
</p>
<p>
:hxq
> - ?; why is the notion of complexity/generality
> and/or of self modification (recursiveness)
> important to superintelligence safety considerations?.
</p>
<p>
- that the notion of 'AI'
can be either "narrow" or "general":.
</p>
<p>
- that the notion of '<b>narrow AI</b>' specifically implies:.
- 1; a single domain of sense and action.
- 2; no possibility for self base-code modification.
- 3; a single well defined meta-algorithm.
- 4; that all aspects of its own self agency/intention
are fully defined by its builders/developers/creators.
</p>
<p>
- that the notion of '<b>general AI</b>' specifically implies:.
- 1; multiple domains of sense/action.
- 2; intrinsic non-reducible possibility for self modification;.
- 3; and that/therefore; that the meta-algorithm
is effectively arbitrary; hence;.
- 4; that it is _inherently_undecidable_ as to whether
<b>all</b> aspects of its own self agency/intention
are fully defined by only its builders/developers/creators.
</p>
<p>
- where the notion of 'learns'
implies 'modifying its own behavior'
and 'adapts' implies 'modifying its own substrate';
that the notion of 'learning how to learn'
(the capability of increasing its capability)
can directly imply (cannot not imply)
modifying its own code and/or substrate.
- that/therefore the notion/idea of 'sufficiently complex'
includes (cannot not include) some notion of
'can or does modify its own code/substrate';.
- that the notion of 'general'
can/must eventually include
modifying its own code at any (possible) level.
- ie; as including at the level of substrate,
(how it is built; possible changes to its
ambient operating conditions, optimization, etc)
though <b>not</b> including the level of the
regularity of the lawfulness of physics
(ie, as actually impossible in practice).
</p>
<p>
- that the notion of 'generality',
when fully applied to any AI system,
will very easily result in that 'general AI'
also being able to implement and execute
arbitrary programs/code (ie; as learned skills
and capabilities, as adapted to itself).
- where 'arbitrary' here means 'generality',
in that it is not necessary to bound
the type or kind or properties of the
potential future program(s) that the AGI
could potentially execute,
<b>except</b> insofar as to indicate at least
some finite (though maybe very large) limits
on the size, time, and/or energy,
that is available to run/execute it.
</p>
<p>
- that/therefore a/any/the/all notions of AGI/APS,
and/of "superintelligence" (AAA, AIA, MIA, MAI),
is/are for sure 'general enough'
to execute any (finite) program,
and/or to be/become entangled with unknown programs,
and/or to maybe self modify,
so as to execute/be/become unknown programs.
</p>
<p>
- as that its own program/code/algorithm
becomes increasingly unknown
and potentially unknowable, to any observer --
inclusive of whatever monitoring,
control, corrections systems/programs,
we might have attempted to install in advance.
</p>
<p>
- where considering the Church Turing Thesis
(and ongoing widely extensible and available results
in the field of computational science),
that the threshold needed to obtain
"general computational equivalence"
is very very low.
- as that nearly anything that implements
and/or "understands" or responds to
any sort of conditional logic,
of doing and repeating anything
in some sort of regular
or sequential order,
already implements all that is needed
for general algorithmic computation.
- moreover; that embedding or interpreting
one language, process, program, model, or algorithm
within the context of some other process,
language, model, algorithm, or program, etc --
ie, the notion of 'virtualization' --
is something used in comp-sci all the time.
</p>
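<p>
- where as an illustration (a hypothetical sketch,
not part of the original text);
that a few lines of nothing more than
conditional branching and repetition
are already enough to interpret arbitrary programs
of a Minsky-style counter machine --
a formalism classically known to be Turing complete:.
</p>
<pre>
# a minimal sketch (illustrative only) of how low the threshold for
# general computational equivalence is: an interpreter for a Minsky-style
# counter machine, built from nothing more than conditionals and loops.

def run(program, registers, max_steps=100000):
    """program: list of instructions, each one of:
       ('inc', r, nxt)             increment register r, go to nxt
       ('decjz', r, nxt, on_zero)  if r is 0 go to on_zero,
                                   else decrement r and go to nxt
       ('halt',)
    """
    pc = 0
    for _ in range(max_steps):
        op = program[pc]
        if op[0] == 'halt':
            return registers
        if op[0] == 'inc':
            _, r, nxt = op
            registers[r] += 1
            pc = nxt
        elif op[0] == 'decjz':
            _, r, nxt, on_zero = op
            if registers[r] == 0:
                pc = on_zero
            else:
                registers[r] -= 1
                pc = nxt
    raise RuntimeError("step budget exhausted; behavior not decided")

# example: add register 1 into register 0, then halt
adder = [
    ('decjz', 1, 1, 2),   # 0: if r1 is 0 goto halt, else r1 -= 1
    ('inc', 0, 0),        # 1: r0 += 1, loop back to 0
    ('halt',),            # 2: done
]
print(run(adder, {0: 3, 1: 4}))   # {0: 7, 1: 0}
</pre>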
<p>
- however; where/rather than emulating or virtualizing
some arbitrary algorithm within some aspect of
the general capabilities of the general AI;
that a general AI could as easily modify its own code
and programming to directly incorporate and integrate
that arbitrary algorithm.
</p>
<p>
- therefore, that it is inherent in the nature of AGI
that we cannot, even in principle,
know anything in advance about
what code will be running exactly
in <b>association</b> with that AGI,
or as an explicit part of that AGI
(as incorporated into it, at its own election,
at some future point, due to some unforeseen
future circumstances, due to unknown possible
environmental changes and/or unforeseen states,
and/or unknown/unexpected interactions
between the AGI system and its environment,
and/or other people, agents, and AGI systems,
etc, etc);.
- ^; then/that/therefore,
considerations and limits inherently
associated with the Rice Theorem
are/become fully applicable/relevant.
</p>
<p>
- that the class of all possible AGI algorithms
is strictly outside of the class of programs
for which prediction methods are possible.
- as that not even <b>one</b> AGI system
will ever be fully within
the class of verifiable programs.
</p>
<p>
- that there is therefore zero suggestion
that there is any possibility at all
that any purported formal safety verification technique,
now known, or even which, in principle,
could ever be known, at any future time,
could be applied to assure
the safety/alignment
of any actual AGI system.
</p>
<p>
- where for systems that have specific,
well defined, and unchanging codebase/algorithms,
and for which we can ensure
that such systems never have
complex interactions with its environment
which result in some form of
self mutation, adaptation, optimization;
and where we can fully characterize
the possible allowable ranges of inputs,
that we can, and routinely do,
characterize something of the ranges of outputs.
</p>
<p>
- as equivalent to the observation:.
- where for systems where we can
at least reasonably fully characterize:.
- 1; the range and nature of the inputs.
- 2; the range and nature of the processing,
(of the states and nature of the system itself).
- ^; that 3; it is at least possible,
<b>sometimes</b>, in principle (maybe),
for reasonably simple/tractable/regular systems (only),
to characterize something about
the range and nature of the outputs.
- that nearly all current engineering methods
tend to focus on the selection and use of systems
for which <b>all</b> of these conditions apply,
so that such engineers can at least sometimes
potentially make (if no design mistakes)
systems with known properties
of safety, utility, cost effectiveness, etc.
</p>
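<p>
- where as a small (hypothetical) sketch
of the kind of system that satisfies
all of the above conditions,
such that a bound on the outputs
can actually be stated and checked:.
</p>
<pre>
# a minimal sketch (illustrative only) of the kind of system that current
# engineering methods favor: a fixed, unchanging algorithm, a fully
# characterized input range, and a provable output bound.

def scaled_clamp(x: float) -> float:
    """Map any numeric input into the output range [0.0, 1.0]."""
    x = min(max(x, 0.0), 100.0)   # enforce the characterized input range
    return x / 100.0              # output therefore provably lies in [0.0, 1.0]

print(scaled_clamp(37.2))   # 0.372, guaranteed to fall within [0.0, 1.0]
</pre>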
<p>
- that predictability works
for some types of specific code --
things written with the specific intention
to be understandable, predictable,
modifiable, updateable, etc.
</p>
<p>
- where for AGI systems of any merit;
that exactly <b>none</b> of these
tractability conditions apply:.
</p>
<p>
- 1; that we know little to nothing about
the actual, in practice, range and nature
of the future inputs of the AGI system
(ie, their complexity and/or physicality
depends on future environmental conditions,
which often change due to circumstances
outside of developer control).
</p>
<p>
- 2; that we know little to nothing about
the range and nature of the processing
internal to the AGI, insofar as it will
be the result of past learnings, inclusive
of possible indirect inheritances
(ie; via arbitrary code and data transfers)
and/or also due to integration of such
foreign code/data, and/or emulation of same,
etc, as occurring over long intervals of time.
</p>
<p>
- 3; that the inherent internal complexity
is well above the threshold of Turing Equivalence
and as such, the overall system is not at all
simple, tractable, or regular,
for any real/reasonable meanings of these terms.
</p>
<p>
- that self generating/authoring/modifying AGI code
will not likely have any of the needed features,
to establish any kind of general predictability,
and thus, reasoning about the future unknown states
and variations of that potential/possible code
is a lot more like the "any unknown arbitrary"
chunk of code case as named in the Rice Theorem,
than it is the "known specific code" case
sometimes posited as a counterexample.
</p>
<p>
:j2j
> - ?; why is there any necessary association
> between superintelligence and x-risk?.
</p>
<p>
- where needing to distinguish 'localized risks'
(ie, moderate levels of known acceptable risks)
from 'global catastrophic risk' and/or (worse)
'existential risks of deployed technology':.
</p>
<p>
- where define; moderate risk systems:.
- as referring to systems for which
all possible error/problem outcome states
have purely local effects.
</p>
<p>
- where define; high x-risk systems:.
- as referring to systems for which
1; at least some possible error/problem outcomes
have effects and dynamics (catalytic actions)
that extend well beyond the specific location
(and cannot be made to not so extend)
where the error/problem 1st occurred,
<b>and which also</b> 2; involve very strong
change energies, which are well beyond
the adaptive tolerances of the systems
already in place (life already living)
in those (diverse/distributed) locations.
</p>
<p>
- where for moderate risk systems,
(as distinguished from high x-risk);.
- that there is a very wide class of code
a lot of which is already in current use
for which the behavior is 'un-modelable'
by anything that is simpler than
running the actual program itself.
- if running the program is unsafe,
then running the program is unsafe.
- where for most of the code being run,
and because the consequences of failure
are reasonably local in time and space,
that this non-modelability is validly
not seen as a problem/risk.
- that running something
in non-catalytic environments
has the implicit implication that
even the worst outcome is limited
to the local failure of equipment,
or, at most, strictly local damage/destruction.
</p>
<p>
- where unfortunately; that there are
catalytic systemic aspects
inherent in AGI/superintelligence itself.
- where for more; (@ see this essay https://mflb.com/ai_alignment_1/aps_detail_out.html).
- that these extend the risk profile of AGI
into the 'non-local, strong changes' category,
and therefore also into high x-risk category.
</p>
<p>
- where for any system
that has high x-risk factors;.
- note; 'high' both in terms of magnitude/scope,
and also 'high' in terms of probability to occur.
- ie; when considering systems
with known and acknowledged potential
for existential/terminal catastrophic risk.
- that the action of trying/attempting
to determine experimentally,
by trial and error,
whether some program has some property like safety
is deeply irresponsible,
in the worst possible way.
</p>
<p>
- that it does not matter
how well defined,
or how specific,
our knowledge may be
of the exact sequence of specific instructions;
the risk and alignment profile
for a very large class of actual programs
remains inherently unknown and unknowable.
</p>
<p>
- for example; that one does not experiment with
dangerous 'gain of function' infectious virus research
when out in the open, in unprotected spaces!.
- as similar to worries
that another Covid might happen.
</p>
<p>
- that the behavior of simple systems
with non-catalytic effects
is very different, in risk profiles,
than even fairly simple systems
with inherent auto-catalytic effects.
- ie; conceptually speaking,
nuclear explosive devices
are fairly simple in their overall concept --
the fundamental algorithm describing them
can often be described with just a small
set of equations and sequential processes.
- that the latter is very unsafe,
despite the apparent deceptive simplicity
and finiteness of the algorithmic code.
</p>
<p>
- where engineers/researchers
are in practice concerned with
only systems/algorithms/optimizations
with a possibility space
of only local effects;
that they do not usually have to consider
whether a given program will ever halt
mostly because whether or not
the program halts does not matter --
they can always interrupt or quit the program
and/or pull the plug in the worst case,
or in case of any real difficulty.
- that the consequences of not halting
are not problematic in most cases.
</p>
<p>
- where for AGI, where for systems
for which it is entirely unclear if/when
there will ever be any possibility
of stopping/halting them,
and where the risk of not-stopping, etc,
is roughly equivalent to
all future people dying,
then it becomes a lot more important
to consider things like 'halting' and 'safety'.
- as that the space of all possibilities,
and the space of the potential of
even the possibility of being able
to know/predict any bounds at all
on the future states/possibilities
becomes critically important.
</p>
<p>
:j5y
> - ?; is there any way that formal methods,
> at least in principle, sometimes,
> could maybe help with at least some aspects
> of the design of safe AGI systems?.
</p>
<p>
No.
</p>
<p>
- that there are some specific and useful algorithms
for which no one expects
that there will <b>ever</b> be any techniques
of applying formal methods
so as to be able to establish
some specific and well defined property X.
</p>
<p>
- that formal verification techniques
are generally only applied
to smaller and more tractable systems and algorithms.
- that there will always be a very large class
of relevant and practical programs/systems
for which the methods of formal verification
simply cannot, even in principle, be used.
</p>
<p>
- that 'formal verification' cannot be
used for/on every program/system.
</p>
<p>
- where from other areas of comp-sci research;
that there are very strong indications
that once the inherent Kolmogorov complexity
exceeds a certain (fairly low) finite threshold,
that the behavior of the program/system
becomes inherently intractable.
</p>
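<p>
- where as a well known (and here purely illustrative) example
of how small that complexity threshold is;
that the short program sketched below
halts if and only if the Goldbach conjecture is false --
a question that no known analysis of the code itself
can settle, short of unbounded search:.
</p>
<pre>
# a classic illustration (not from this essay) of intractable behavior
# from a tiny description: this loop halts exactly when some even number
# has no decomposition into two primes, ie, if and only if the Goldbach
# conjecture is false.

def is_prime(k: int) -> bool:
    if k in (0, 1):
        return False
    return all(k % d for d in range(2, int(k ** 0.5) + 1))

n = 4
while any(is_prime(p) and is_prime(n - p) for p in range(2, n)):
    n += 2        # this even number decomposes; try the next one
print(n)          # reached only if some even number does not decompose
</pre>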
<p>
- where considering a 'property of the system'
as basically some abstraction over
identifying specific subsets
of accessible system states;.
- that no specific property
of such a complex system
can be determined.
</p>
<p>
- that these and other understood outcome(s)
(regarding program complexity at runtime,
limits of formal verification, etc)
are due to a number of
well established reasons in comp-sci
other than just those associated with
the Rice Theorem.
- examples; considerations of O-notation,
Busy Beaver and Ackermann functions,
what minimally constitutes
Church Turing Equivalence, etc.
</p>
<p>
- that the formal methods of proof are only
able to be applied to the kind of deductive reasoning
that can establish things like impossibility --
they are not at all good at establishing things
like possibility, especially in regards to AGI safety.
</p>
<p>
:j7u
> - ?; can anyone, ever, at any time,
> ever formally/exactly/unambiguously/rigorously prove
> at any level of abstraction/principle,
> that something (some system, some agent, some choice)
> will <b>not</b> have 'unintended consequences'?.
</p>
<p>
No; not in practice, not in the actual physical universe.
</p>
<p>
However, to see this,
there are a few underlying ideas
and non-assumptions we will need to keep track of:.
</p>
<p>
- that the domain of mathematics/modeling/logic
and the domain of physics/causation (the real world)
are <b>not</b> equivalent.
- ie; that the realm of "proof" is in pure mathematics,
as a kind of deterministic formality, verification, etc;
whereas the aspects of system/agent/choice/consequence
are inherently physical, real, non-deterministic,
as are the ultimate results of concepts and
abstractions like "safety" and "alignment".
</p>
<p>
- that the physical causative universe
has hard limits of knowability and predictability,
along with practical limits of energy, time,
and space/memory.
- ref; the Planck limit of a domain,
the Heisenberg uncertainty principle,
etc.
</p>
<p>
- that the real physical universe
is <b>not</b> actually closed/completed
in both possibility and probability
(even though, for reasonableness sake,
we must treat them mathematically
<b>as if</b> that was the case).
</p>
<p>
- that not all possibilities
can be finitely enumerated.
</p>
<p>
- that the summation of probability
over each of the known possibilities
cannot always be exactly calculated.
</p>
<p>
- where for some explicit subset
of the available possibilities;
that at least some of these probabilities
cannot be shown to be exactly zero,
or that the summation of all probabilities
will sum to exactly unity.
</p>
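<p>
- where as a compact restatement (in notation added here,
not present in the original text) of the last three points,
assuming a probability measure over the full space even exists:.
</p>
<pre>
% only a subset of the possibilities is ever enumerable:
\[
\Omega_{\mathrm{known}} \subsetneq \Omega,
\qquad
\sum_{\omega \in \Omega_{\mathrm{known}}} P(\omega)
  \;=\; 1 - P(\Omega \setminus \Omega_{\mathrm{known}}) \;\le\; 1 .
\]
% and since P of the un-enumerated remainder cannot be shown to be
% exactly zero, the sum over the known possibilities cannot be shown
% to reach exactly unity.
</pre>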
<p>
- that any specific 'intention'
cannot be exactly specified, nor is it exactly specifiable,
at all levels of abstraction,
for <b>any</b> and <b>every</b> agent
that could be involved.
</p>
<p>
- where in a more general way, even just within
the domain of just deterministic mathematics,
that the non-provability and non-predictability of safety
and consequence is a result of the Rice Theorem:.
</p>
<p>
- that there is no single finite universal
procedure, method, processes or algorithm
(or even any collection of procedures, etc)
by which anyone can (at least in principle)
identify/determine (for sure, exactly)
whether some specific program/system/algorithm
has any particular specific property,
(including the property of 'aligned' or 'safe'),
that will for sure work (make a determination)
for every possible program, system, or algorithm.
</p>
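<p>
- where for an informal (sketch-level) sense of why this is so;
that if any such universal checker did exist,
it could be turned directly into a decider
for the halting problem, which is already known
to be impossible;.
- note; the names below, such as 'is_safe'
and 'perform_unsafe_action', are purely hypothetical:.
</p>
<pre>
# a standard sketch of the reduction behind the Rice Theorem claim above:
# assume a universal decider is_safe(program) for some non-trivial
# behavioral property ("never performs the unsafe action"); then it could
# be used to decide the halting problem, which is impossible.

def perform_unsafe_action():
    """hypothetical placeholder for some behavior deemed unsafe"""
    raise RuntimeError("unsafe action performed")

def decides_halting(is_safe, program, x):
    """Assumes is_safe() correctly decides safety for every program.
    Builds a wrapper whose safety depends exactly on whether
    `program` halts on input `x`."""
    def wrapper(_input):
        program(x)                  # loops forever if program(x) never halts
        perform_unsafe_action()     # reached only if program(x) halts
    # wrapper is unsafe if and only if program(x) halts,
    # so a correct is_safe() would decide halting -- a contradiction.
    return not is_safe(wrapper)
</pre>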
<p>
- that there are some limits to the Rice theorem:.
</p>
<p>
- that the Rice Theorem does <b>not</b> claim
that there are no specific procedures,
(processes, methods, or algorithms, etc)
by which one could characterize some well defined
(usually fairly simple) specific finite algorithm
as having some specific property.
</p>
<p>
- for example; that it might be possible,
using some (as yet unknown) procedure,
to identify that some narrow AI is safe,
for some reasonably defined notion of 'safe'.
</p>
<p>
- that what the Rice Theorem <b>does</b> claim
is that whatever procedures are found
that can maybe work in some cases,
that there will always be some other
(potentially useful) programs/systems
that inherently cannot be characterized
as having some arbitrary specific desirable property,
even if that property is also well defined.
- as that there is no way to establish
any specific property as applying to
every possible useful program/system.
</p>
<p>
- that a/any generally-capable machine(s)
(ie; especially ones that learn and/or reason
about how to modify/optimize their own code)
will of course <b>not</b> attempt to run
<b>all possible</b> algorithms
that are theoretically computable
on a universal Turing machine
(ie; any arbitrary algorithm).
- ie; will not take non-optimal actions
that do not benefit anything at all.
</p>
<p>
- however, when assuming the machine
continues to learn and execute (does not halt);
it will for sure eventually run
some specific selected subset
of all the possible useful algorithms
that are computationally complex
(ie; in the P vs NP sense).
- that this is for sure enough
to be actually computationally irreducible,
in the sense of not being future predictable
<b>except</b> through the direct running of that code.
</p>
<p>
- thus; the mathematical properties
of that newly integrated code
can often not be determined
for any such newly learned algorithm
until computed in its <b>entirety</b>
over its computational lifetime
(potentially indefinitely).
- ie; that such newly integrated algorithms
cannot be decomposed
into sandboxed sub-modules
which are also simple enough
to fully predict what the effects of
their full set of interactions will be.
- in effect; that it cannot be known (in advance)
if such a learned algorithm
will reveal output or state transitions
which would be recognized as unsafe
at a later point --
for instance, after the equivalent of
a few decades of computation on a supercomputer.
</p>
<p>
- nor can we know in advance
whether the newly integrated algorithm
would halt before that point of non-safety
(and therefore knowably remain safe,
ie; unless or until triggering
the (state-adjusted) execution
of another algorithm on the machine
that turns out to be unsafe).
- while the Rice Theorem makes formal assumptions
that do not <b>precisely</b> translate to practice
(regarding arbitrary algorithms and
reliance on halting undecidability),
there are clear correspondences
with how algorithms learned by AGI
would have undecidable properties in practice.
</p>
<p>
- that we will never have a procedure
that takes any possible chunk of code
and accurately predicts the consequences
of running that specific selected code.
- as the basic implication of the Rice Theorem.
</p>
<p>
- while it is impossible to design a procedure
that would check any arbitrary program for safety;
that we can still sometimes reason about the safety properties
of some types of much simpler computer programs.
</p>
<p>
- that <b>maybe</b> we might be able to design
some <b>narrow</b> AI systems so that they can
<b>maybe_sometimes</b> behave predictably,
in some important relevant practical aspects
(though probably not in all aspects,
at all levels of abstraction,
without also actually running the program --
which no longer counts as 'prediction').
</p>
<p>
- however, there will <b>always</b> be a strictly larger
class of programs (inclusive of most narrow AI systems)
whose behavior is inherently unpredictable
in advance of actually running the program,
than the class of programs
which are simple and tractable enough
for which the output of that program
could be predicted in advance.
</p>
<p>
- where it is possible to sometimes predict
the results of some limited and select subset
of all human-made useful programs/tools;
that this does not imply anything important
with regards to cost/benefit/risk assessments
of any future proposed AGI deployments.
</p>
<p>
:jac
> - that some programs are perfectly predictable.
> - cite example; the print "hello world" program.
> - ?; would any argument of general AI non-safety
> simply claim too much, and therefore be invalid?.
> - ?; does your argument mistakenly show
> that all programs are unpredictable?.
</p>
<p>
No.
</p>
<p>
- that the proof does not conflate
simple finite programs
with complex programs.
</p>
<p>
- where simple programs
have clearly bounded states of potential interaction.
- as maybe having known definite simple properties.
- where complex programs can have complex interactions
with the environment (users/etc).
- that claims about properties of complex programs
are not so easily proven, by any technique.
- when/once unknown complex unpredictable interactions
with the environment are also allowed,
then nearly all types of well defined properties
become undecidable.
- ie; properties like usability, desirability,
salability, safety, etc.
</p>
<p>
- even where in a fully deterministic world,
such as that of mathematics or of pure computer science;.
- that it takes very little effort
to write a program that is sufficiently complex
that its specific sequence of outputs
are inherently unpredictable,
even in principle,
by any process or procedure --
by any other method --
other than actually running the program
and recording its outputs, as discovered.
</p>
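<p>
- where as a tiny (hypothetical) example of such a program;
that iterating a cryptographic hash a million times
is only a few lines of fully deterministic code,
and yet no procedure simpler than actually running the chain
is known to be able to predict its final output:.
</p>
<pre>
# a minimal sketch (illustrative only): a short, fully deterministic
# program whose final output has no known shortcut prediction --
# ie, nothing simpler than running the whole chain.

import hashlib

state = b"seed"
for _ in range(1_000_000):
    state = hashlib.sha256(state).digest()   # each step depends on the full prior chain
print(state.hex())
</pre>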
<p>
- that the action of creating something actual,
in nature, using real physics
that has real uncertainties built in,
means that creating things
whose outcome is unpredictable
(inherently, and in principle,
once statistical averaging is factored out)
is really quite easy, and moreover,
happens more often than not.
- that creating things in nature
(in the real world)
whose outcomes are <b>predictable</b>
takes significant deliberate effort.
- as the actual work of engineering.
</p>
<p>
- that the actual real world is not a computer model;
that theory is <b>not</b> practice.
- ie; no one has proven
that we actually live in
a computer simulation --
as a proxy for a perfectly deterministic universe.
- that anyone attempting to make the claim
that "the real world" and "computer modeling"
are actually and fundamentally strictly equivalent,
would need to produce some really high class
extraordinary evidence.
- that until such incontrovertible
empirical or logical demonstration
is actually obtained, provided, etc,
that the default position --
that the real world is not a model,
that they are maybe somehow different --
will be upheld.
</p>
<p>
- that the action of treating both classes of AI
as if they were the same,
and/or could be treated the same way,
is the sort of informality and lack of discipline
that, when applied in any safety-critical field,
eventually ends up getting real people killed
(or in this case, maybe terminates whole planets).
</p>
<p>
- that the only people
who usually make these sorts of mistakes
(whether on purpose or actually accidentally)
tend to be non-engineers;
ie; the managing executives, social media marketing,
public relations and spin hype specialists,
and/or the private shareholders/owners/beneficiaries
of the systems/products that will eventually cause
significant harm/costs to all ambient others
(ie; the general public, planet, etc).
</p>
<p>
- that it is entirely possible for multiple
correct proofs to co-exist
within the same mathematical framework.
- when/where both proofs are correct,
that they can co-exist.
- that proving one thing
does not "disprove" another thing
that has already been proven.
</p>
<p>
- where example; that geometry, algebra, etc,
remained useful as tools,
and are still applied to purpose,
on a continuing basis, etc,
<b>despite</b> the development of various kinds
of proofs of impossibility:.
- squaring the circle.
- doubling the cube.
- trisecting the angle.
- identifying the last digit of pi.
- establishing the rationality of pi.
</p>
<p>
- another example:.
- where the specific proof
that the Continuum Hypothesis,
as a foundational problem,
was actually undecidable
(given the available axioms)
was <b>not</b> to suggest
that nothing else
in the entire field of mathematics
was provable/useful.
</p>
<p>
- that it was never claimed
by/in any discipline of math
(or any other field of study)
that it would be able to solve
every specific foundational problem.
- similarly; that no one
in the field of formal verification,
(or in the field of AGI alignment/safety)
has made the claim that,
"at least in principle",
that the tools already available
in such fields of study/practice
'could even potentially solve
all important foundational problems'
that exist within their field.
</p>
<p>
:jdg
> - ?; therefore; can we at least make <b>narrow</b>
> (ie; single domain, non-self-modifying)
> AI systems:.
>
> - 1; "safe"?.
> - ie; ?; safe for at least some
> selected groups of people
> some of the time?.
>
> - 2; "aligned"?.
> - ie; ?; aligned with
> the interests/intents/benefits/profits
> of at least some people
> at least some of the time?.
</p>
<p>
Yes; at least in principle.
</p>
<p>
- as leaving aside possible problems
associated with the potential increase
of economic choice inequality
that might very likely result.
- as distinguishing that 'aligned'
tends to be in favor of the rich,
who can invest in the making
of such NAI systems
to favor their own advantage/profits,
and that the notion of "safe"
tends to be similarly construed:
ie, safe in the sense of 'does not hurt
the interests/well-being of the owners,
regardless of longer term harms
that may accrue to ambient others
and/or the larger environment/ecosystem'.
</p>
<p>
- that declaring that general AI systems
are inherently unsafe over the long term
is <b>not</b> to suggest that narrow AI systems
cannot have specific and explicit utility
(and safety, alignment) over the short term.
</p>
<p>
- that there are important real distinctions
of the risks/costs/benefits
(the expected use and utility
to at least some subset of people)
associated with making and deployment of narrow AI
vs the very different profiles of risks/costs/benefits
associated with the potential creation/deployment
of general AI.
</p>
<p>
- that a proof of the non-benefit and terminal risk
of AGI is not to make any claims regarding narrow AI.
</p>
<p>
:jfc