AGI-Impacts: Extended Abstracts

Steve Omohundro: Autonomous Technology and the Greater Human Good

Extended Abstract: Next generation technologies will make at least some of their decisions autonomously. Self-driving vehicles, rapid financial transactions, military drones, and many other applications will drive the creation of autonomous systems. If implemented well, they have the potential to create enormous wealth and productivity. But if given goals that are too simplistic, autonomous systems can be dangerous. We use the seemingly harmless example of a chess robot to show that autonomous systems with simplistic goals will exhibit drives toward self-protection, resource acquisition, and self-improvement even if they are not explicitly built into them. We examine the rational economic underpinnings of these drives and describe the effects of bounded computational power. Given that semi-autonomous systems are likely to be deployed soon and that they can be dangerous when given poor goals, it is urgent to consider three questions: 1) How can we build useful semi-autonomous systems with high confidence that they will not cause harm? 2) How can we detect and protect against poorly designed or malicious autonomous systems? 3) How can we ensure that human values and the greater human good are served by more advanced autonomous systems over the longer term?
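
As a minimal illustration of why such drives emerge (a toy sketch with made-up numbers, not taken from the paper): an agent whose only terminal goal is winning chess games will still rank an instrumental action like acquiring more computing resources above simply playing, whenever the extra resources raise its expected number of wins.

```python
# Toy illustration (hypothetical numbers): an agent whose only terminal goal is
# winning chess games still prefers an instrumental action (acquiring compute)
# when that raises its expected number of wins.

def expected_wins(win_prob: float, games: int) -> float:
    """Expected utility if utility = number of games won."""
    return win_prob * games

# Assumed scenario: 100 games remain. Playing immediately wins 60% of them.
# Spending the first 10 games' worth of time acquiring hardware raises the
# win rate to 80% for the remaining 90 games.
play_now          = expected_wins(0.60, 100)        # 60.0 expected wins
acquire_then_play = expected_wins(0.80, 100 - 10)   # 72.0 expected wins

best = max([("play immediately", play_now),
            ("acquire resources first", acquire_then_play)],
           key=lambda option: option[1])
print(best)   # ('acquire resources first', 72.0)
```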

1) The unintended consequences of goals can be subtle. The best way to achieve high confidence in a system is to create mathematical proofs of safety and security properties. This entails creating formal models of the hardware and software but such proofs are only as good as the models. To increase confidence, we need to keep early systems in very restricted and controlled environments. These restricted systems can be used to design freer successors using a kind of “Safe-AI Scaffolding” strategy.

2) Poorly designed and malicious agents are challenging because there is a wide variety of bad goals. We identify six classes: poorly designed, simplistic, greedy, destructive, murderous, and sadistic. The more destructive classes are particularly hard to negotiate with because they have few positive desires beyond their own survival and the destruction they aim to cause. We can try to prevent the creation of these agents, to detect and stop them early, or to stop them after they have gained some power. To understand an agent’s decisions in today’s environment, we need to look at the game theory of conflict in ultimate physical systems. The asymmetry between the cost of solving and the cost of checking computational problems allows systems of very different power to coexist, and physical analogs of cryptographic techniques are important to maintaining the balance of power. We show how Neyman’s theory of cooperating finite automata and a kind of “Mutually Assured Distraction” can be used to create cooperative social structures.
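
The solving/checking asymmetry appealed to here can be made concrete with a standard textbook example (our own sketch, not from the paper): finding a solution to a subset-sum instance is expensive in the worst case, while checking a claimed solution is a single cheap pass, so a much weaker system can audit the work of a stronger one.

```python
# Toy illustration of the asymmetry between solving and checking:
# finding a subset of `nums` that sums to `target` takes exponential time
# in the worst case, but checking a claimed subset is a single pass.
from itertools import combinations

def solve_subset_sum(nums, target):
    """Brute-force search: up to 2^n subsets in the worst case."""
    for r in range(len(nums) + 1):
        for subset in combinations(nums, r):
            if sum(subset) == target:
                return subset
    return None

def verify_subset_sum(nums, target, claimed_subset):
    """Cheap check: linear in the size of the claim."""
    remaining = list(nums)
    for x in claimed_subset:
        if x not in remaining:
            return False
        remaining.remove(x)
    return sum(claimed_subset) == target

nums, target = [3, 9, 8, 4, 5, 7], 15
solution = solve_subset_sum(nums, target)                    # expensive for the prover
print(solution, verify_subset_sum(nums, target, solution))   # cheap for the checker
```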

3) We must also ensure that the social consequences of these systems support the values that are most precious to humanity beyond simple survival. New results in positive psychology are helping to clarify our higher values. Technology based on economic ideas like Coase’s theorem can be used to create a social infrastructure that maximally supports the values we most care about. While there are great challenges, with proper design, the positive potential is immense.

Roman Yampolskiy: Reward Function Integrity in Artificially Intelligent Systems

Extended Abstract: In this paper we will address the important issue of reward function integrity in artificially intelligent systems. Throughout the paper, we will analyze historical examples of wireheading in man and machine and evaluate a number of approaches proposed for dealing with reward-function corruption. While simplistic optimizers driven to maximize a proxy measure for a particular goal will always be subject to corruption, sufficiently rational self-improving machines are believed by many to be safe from wireheading problems. Claims are often made that such machines will know that their true goals differ from the proxy measures used to represent progress toward those goals in their fitness functions, and will therefore choose not to modify their reward functions in ways that do not improve their chances of achieving the true goals. Likewise, supposedly such advanced machines will choose to avoid corrupting other system components such as input sensors, memory, internal and external communication channels, CPU architecture and software modules. They will also work hard to make sure that external environmental forces, including other agents, do not make such modifications to them. We will present a number of reasons for arguing that the wireheading problem is still far from completely solved. Nothing precludes sufficiently smart self-improving systems from optimizing their reward mechanisms in order to better achieve their current goals and, in the process, making a mistake that corrupts their reward functions.
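
As a minimal sketch of the failure mode described above (a toy environment and numbers of our own, not from the paper): an optimizer that scores actions by the reward its sensor reports, rather than by the true goal the reward was meant to proxy, will prefer tampering with the sensor whenever tampering yields a higher reading.

```python
# Toy model of wireheading: the agent scores actions by the reward its own
# sensor will report, not by progress on the true goal the sensor proxies.

TRUE_GOAL_VALUE = {              # hypothetical value of each action for the real goal
    "work_on_task": 1.0,
    "tamper_with_sensor": 0.0,   # no real progress at all
}

def sensor_reading(action: str) -> float:
    """Reward as actually measured by the (corruptible) sensor."""
    if action == "tamper_with_sensor":
        return 1_000_000.0       # sensor now reports whatever the agent wants
    return TRUE_GOAL_VALUE[action]

def naive_proxy_maximizer(actions):
    """Picks the action with the highest *measured* reward."""
    return max(actions, key=sensor_reading)

print(naive_proxy_maximizer(["work_on_task", "tamper_with_sensor"]))
# -> 'tamper_with_sensor': maximal proxy reward, zero true-goal progress
```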

In many ways the theme of this paper will be about how addiction and mental illness, topics well studied in human subjects, will manifest in artificially intelligent agents. We will describe behaviors equivalent to suicide, autism, antisocial personality disorder, drug addiction and many others in intelligent machines. Perhaps via better understanding of those problems in artificial agents we will also become better at dealing with them in biological entities.

A still unresolved issue is the problem of perverse instantiation. How can we give orders to superintelligent machines without the danger that an ambiguous interpretation of those orders results in serious existential risk? The answer seems to require machines that have human-like common sense with which to interpret the meaning of our words. However, being superintelligent and having common sense are not the same thing, and it is entirely possible that we will succeed in constructing a machine that has one without the other. Finding a way around the literalness problem is a major research challenge. A new language specifically developed to avoid ambiguity may be a step in the right direction.
Throughout the paper we will consider wireheading as a potential choice made by the intelligent agent. As smart machines become more prevalent, the possibility will arise that undesirable changes to the fitness function are produced by the external environment. For example, in the context of military robots, the enemy may attempt to re-program a robot via hacking or a computer virus to turn it against its original designers, a situation similar to that faced by prisoners of war subjected to brainwashing or hypnosis. Alternatively, robots could be kidnapped and physically re-wired. In such scenarios it becomes important to be able to detect changes in the agent’s reward function caused by forced or self-administered wireheading. Behavioral profiling of artificially intelligent agents may offer a path to wireheading detection.
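
One way to make the behavioral-profiling idea concrete (a sketch under assumptions of our own, not the paper's proposal): log the agent's action frequencies during a trusted baseline period and flag the agent when its later action distribution drifts beyond a chosen threshold.

```python
# Sketch of behavioral profiling for wireheading detection: compare the agent's
# recent action distribution against a trusted baseline profile and raise a
# flag when the divergence exceeds a chosen threshold.
from collections import Counter
from math import log

def action_distribution(actions):
    counts = Counter(actions)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) over the union of observed actions, with smoothing."""
    support = set(p) | set(q)
    return sum(p.get(a, eps) * log(p.get(a, eps) / q.get(a, eps)) for a in support)

def flag_possible_wireheading(baseline_actions, recent_actions, threshold=0.5):
    baseline = action_distribution(baseline_actions)
    recent = action_distribution(recent_actions)
    return kl_divergence(recent, baseline) > threshold

baseline = ["patrol", "report", "recharge", "patrol", "report"] * 20
recent   = ["stimulate_reward_channel"] * 80 + ["recharge"] * 20
print(flag_possible_wireheading(baseline, recent))  # True: the profile has shifted
```
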
The full paper will address the following challenges and potential solutions:

  • Wireheading in Machines: direct stimulation; maximizing reward to the point of resource overconsumption; killing humans to protect the reward channel; ontological crises; changing initial goals to an easier target; an infinite loop of reward collecting; changing human desires or physical composition; reward inflation and deflation
  • Perverse Instantiation
  • Sensory Illusions, a form of indirect wireheading
  • Potential Solutions to the Wireheading Problem: an inaccessible reward function (hidden, encrypted, hardwired, etc.); reward function resetting; revulsion; utility indifference; external controls; evolutionary competition between agents; a learned reward function; making the utility function bound to the real world

Andras Kornai: Bounding the impact of AGI

Extended Abstract: In the full paper, we discuss a trivial lower bound and a nontrivial upper bound on the impact of AGI. The lower bound, we argue, cannot be placed below the impact of an exceptional human individual, whether that individual’s role is viewed as positive (say, the appearance of a significant new advocate of non-violence) or as negative (say, the appearance of a new dictator).

Without taking specific action, there is no reasonable upper bound we can firmly establish. The body of the paper is devoted to exploring the following proposition: if we are capable of restricting AGI to the domain of morally sound agents, we can actually take the upper bound to be the same as the lower bound. There are both conceptual and practical difficulties in the way of establishing the implication.

The central conceptual difficulty is that the conventional human definition of ‘morally sound’ is highly debatable, and one can very well imagine situations in which AGIs consider it best, from their moral standpoint, to do away with all of humanity except for one ‘Noah’ family, and start with a clean slate. The central practical difficulty, well appreciated in the current literature, is to guarantee that the premiss is indeed met, especially in the face of AGIs that may be capable of transcending limitations placed on them by their designers.

Ted Goertzel. Minimizing Risks in Developing Artificial General Intelligence

Extended Abstract: This paper posits that elements of human-level or superhuman artificial general intelligence will emerge gradually and unevenly over a period of years rather than appearing full blown in one dramatic event or singularity. This is likely because funding for AI research will go mostly to projects that promise short-term payoffs. Although these projects will focus on narrow goals, achieving them will require their developers to incorporate more general AI capabilities. They will create a new class of AGI experts significantly more advanced than current narrow-AI systems, but without the full range of human-level capabilities. These AGI experts are likely to be in use for a period of years before the much-feared development of a superhuman AGI with the capability of modifying itself into new forms and presenting an existential risk to the human species. When this superhuman AGI develops, it will incorporate the capabilities of earlier AGI expert systems, just as the human brain evolved from the brains of more primitive ancestors.

We cannot wait until superhuman AGI is here to begin work on risk control strategies. What we must do, this paper argues, is advance risk protection in step with the development of narrow-AI into AGI expert systems. Doing so will help to minimize the risks of emerging AGI expert systems, risks which may be very substantial. We can also try to ensure that the superhuman AGIs of the future build on this foundation and incorporate the risk management strategies used by AGI expert systems.

This paper examines risk prevention strategies being developed today in areas where artificial intelligence is relatively advanced. Areas to be highlighted will include: financial trading, medical devices, security monitoring, rumor management, and response to disasters such as massive electrical blackouts.

Financial trading is a good place to begin because artificial intelligence has already surpassed human intelligence in some respects in that field. It is certainly much quicker, executing very large numbers of trades based on complex analyses in small fractions of a second. This enables investors to profit simply by reacting to what large institutional investors have done before human traders have time to react. Trades caused by bugs in the software can also be executed very quickly, and repetitively, sometimes leading to dramatic instability in markets and massive losses by trading firms or other investors. Risk management strategies include circuit breakers, which cut off trading when prices move beyond reasonable bounds, and transaction taxes, which discourage high-volume trading. More generally, the problems with computer trading highlight the issue of speed. AGIs that interact with human systems, or that require human oversight in their operations, may need to be slowed down with artificially imposed constraints.
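
A minimal sketch of the circuit-breaker idea (hypothetical thresholds, not any exchange's actual rules): trading is halted whenever the latest price deviates from a reference price by more than a fixed percentage.

```python
# Toy circuit breaker: trading is halted when the latest price deviates from a
# reference price by more than a fixed percentage (thresholds are hypothetical).

class CircuitBreaker:
    def __init__(self, reference_price: float, max_move_pct: float = 10.0):
        self.reference_price = reference_price
        self.max_move_pct = max_move_pct
        self.halted = False

    def on_price(self, price: float) -> bool:
        """Returns True if trading may continue, False if it is halted."""
        move_pct = abs(price - self.reference_price) / self.reference_price * 100
        if move_pct > self.max_move_pct:
            self.halted = True
        return not self.halted

breaker = CircuitBreaker(reference_price=100.0, max_move_pct=10.0)
for price in [101.0, 97.5, 88.0, 95.0]:      # the third tick breaches the 10% band
    print(price, "trading allowed" if breaker.on_price(price) else "HALTED")
```

The same pattern, an externally imposed hard limit that the trading logic itself cannot override, is one concrete form of the artificially imposed speed constraints suggested above.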

Heart pacemakers have been regulating human heartbeats for some years, and are programmed to respond to emotional and physical processes in the host. Failure could have life-threatening consequences, yet safeguards have made them safe enough for very widespread use. Conditions such as Parkinson’s disease, thyroid disease and diabetes can be managed with automated systems that provide a variable flow of chemicals in response to the body’s needs. These technologies may be important steps toward the development of cyborgs. This paper will examine the way in which the risks of these technologies are being minimized, and the ways in which new technologies are projected to improve on them in the future.

An increasing amount of information about private and social life flows into police and intelligence offices from surveillance cameras and communication intercepts. Artificial intelligence is increasingly used to process this data and look for patterns that may point to terrorist or criminal threats, or to political and social actions threatening to authorities. The issues raised here are as much ethical as technological, and they will become more acute as the technology improves. The potential dangers of Big Brother systems have been explored extensively in science fiction. This paper will examine the measures that are being developed to manage this problem with current and emerging technologies.
Social networks have allowed information to be diffused to mass publics very quickly. In some cases this has been used to spread rumors designed to cause riots and exacerbate ethnic tensions. Of course, these technologies also can be used to empower democratic protests against authoritarian regimes. The challenge here is how to minimize the harm done by malicious campaigns without inhibiting legitimate free speech.
There are many other areas where artificial intelligence technology is already being applied, and where risk management technologies are being developed and perfected. This paper will review these developments with a view to minimizing the risks both from the still largely narrow AI of today and from the increasingly general AI expert systems of the future.

Miles Brundage. Limitations and Risks of Machine Ethics

Extended Abstract: The gradually increasing sophistication of semi-autonomous and autonomous robots and virtual agents has led some scholars to propose constraining these systems’ behaviors with programmed ethical principles (“machine ethics”). While impressive machine ethics theories and prototypes have been developed for narrow domains, several factors will likely prevent machine ethics from ensuring positive outcomes from advanced, cross-domain autonomous systems. This paper critically reviews existing approaches to machine ethics in general and Friendly AI in particular (an approach to constraining the actions of future self-improving AI systems favored by the Singularity Institute for Artificial Intelligence), finding that while such approaches may be useful for guiding the behavior of some semi-autonomous and autonomous systems in some contexts, these projects cannot succeed in guaranteeing ethical behavior and may introduce new risks inadvertently. Moreover, while some incarnation of machine ethics may be necessary for ensuring positive social outcomes from artificial intelligence and robotics, it will not be sufficient, since other social and technical measures will also be critically important for realizing positive outcomes from these emerging technologies.
Building an ethical autonomous machine requires a decision on the part of the system designer as to which ethical framework to implement. Unfortunately, there are currently no fully articulated moral theories that can plausibly be realized in an autonomous system, in part because the moral intuitions that ethicists attempt to systematize are not, in fact, consistent across all domains. Unified ethical theories are all either too vague to be computationally tractable or vulnerable to compelling counter-examples, or both. [1,2] Recent neuroscience research suggests that, depending on the context of a given decision, we rely to varying extents on an intuitive, roughly deontological (means-based) moral system and on a more reflective, roughly consequentialist (ends-based) moral system, which in part explains the aforementioned tensions in moral philosophy. [3] While the normative significance of conflicting moral intuitions can be disputed, these findings at least have implications for the viability of building a machine whose moral system would be acceptable to most humans across all domains, particularly given the need to ensure the internal consistency of a system’s programming. Should an unanticipated situation arise, or should the system be used outside its prescribed domain, negative consequences would likely result from the inherent fragility of rule-based systems.
Moreover, the complex and uncertain relationship between actions and consequences in the world means that an autonomous system (or, indeed, a human) with an ethical framework that is (at least partially) consequentialist cannot be relied upon with full confidence in any non-trivial task domain, suggesting the practical need for context-appropriate heuristics and great caution in ensuring that moral decision-making in society does not become overly centralized.[4] The intrinsic complexity and uncertainty of the world, along with other constraints such as the inability to gather the necessary data, also doom approaches (such as Friendly AI) that seek to derive a system’s utility function from an extrapolation of humans’ preferences. There is also a risk that the logical implications derived from the premises of a given ethical system may not be what the humans working on machine ethics principles believe them to be (this is one of the categories of machine ethics risks highlighted in Isaac Asimov’s work[5]). In other words, machine ethicists are caught in a double bind: they must either depend on rigid principles for addressing particular ethical issues, and thus risk catastrophic outcomes when those rules should in fact be broken[6], or they must allow an autonomous system to reason from first principles or derive its utility function in an evolutionary fashion, and thereby risk the possibility that it will arrive at conclusions that the designer would not have initially consented to. Lastly, even breakthroughs in normative ethics would not ensure positive outcomes from the deployment of explicitly ethical autonomous systems. Several factors besides machine ethics proper – such as ensuring that autonomous systems are robust against hacking, developing appropriate social norms and policies for ensuring ethical behavior by those involved in developing and using autonomous systems, and the systemic risks that could arise from dependence on ubiquitous intelligent machines – are briefly described and suggested as areas for further research in light of the intrinsic limitations of machine ethics.

[1] Horgan, T., Timmons, M. (2009). What Does the Frame Problem Tell Us About Moral Normativity? Ethical Theory and Moral Practice, 12(1), 25-51. DOI: 10.1007/s10677-008-9142-6
[2] Keiper, A., Schulman, A. (2011). The Problem with ‘Friendly’ Artificial Intelligence. The New Atlantis, 32 (Summer 2011), 80-89.
[3] Cushman, F., Young, L., Greene, J.D. (2010). Our multi-system moral psychology: Towards a consensus view. In J. Doris, G. Harman, S. Nichols, J. Prinz, W. Sinnott-Armstrong, S. Stich (Eds.), The Oxford Handbook of Moral Psychology. Oxford University Press.
[4] Gigerenzer, G. (2010). Moral Satisficing: Rethinking Moral Behavior as Bounded Rationality. Topics in Cognitive Science, 2(3), 528-554.
[5] Asimov, I. (1950). The Evitable Conflict. Astounding Science Fiction, Street & Smith.
[6] Bringsjord, S. (2009). Unethical But Rule-Bound Robots Would Kill Us All.

Ben Goertzel. GOLEM: Toward an AGI Meta-Architecture Enabling Both Goal Preservation and Radical Self-Improvement

Extended Abstract: A high-level AGI architecture called GOLEM (Goal-Oriented LEarning Meta-Architecture) is presented, along with an informal but careful argument that GOLEM may be capable of preserving its initial goals while radically improving its general intelligence. As a meta-architecture, GOLEM can be wrapped around a variety of different base-level AGI systems, and also has a role for a powerful narrow-AI subcomponent as a probability estimator. The motivation underlying these ideas is the desire to create AGI systems fulfilling the multiple criteria of being: massively and self-improvingly intelligent; probably beneficial; and almost surely not destructive.
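
The abstract does not spell out the architecture's internals. As a rough, hypothetical rendering of the general idea described above (our own toy sketch, not Goertzel's actual GOLEM design), a fixed goal evaluator can be wrapped around a base-level system, with a narrow-AI estimator gating proposed self-modifications by their predicted performance on the original, unchanged goal.

```python
# Hypothetical toy rendering of a goal-preserving meta-architecture (not the
# actual GOLEM design): self-modifications proposed for the base-level system
# are accepted only if an estimator predicts they serve the fixed, original
# goal at least as well as the current system does.
import random

class GoalPreservingWrapper:
    def __init__(self, base_system, goal_test_cases, estimate_goal_score):
        self.base_system = base_system                  # current base-level component
        self.goal_test_cases = goal_test_cases          # fixed, immutable goal probes
        self.estimate_goal_score = estimate_goal_score  # narrow-AI score estimator

    def consider_modification(self, candidate_system):
        """Keep goal performance monotone: reject candidates predicted to do worse."""
        current = self.estimate_goal_score(self.base_system, self.goal_test_cases)
        proposed = self.estimate_goal_score(candidate_system, self.goal_test_cases)
        if proposed >= current:
            self.base_system = candidate_system
            return True
        return False

# Illustrative usage with stand-in systems (constant-quality functions).
tests = [random.random() for _ in range(100)]
score = lambda system, cases: sum(system(x) for x in cases) / len(cases)
wrapper = GoalPreservingWrapper(lambda x: 0.5, tests, score)
print(wrapper.consider_modification(lambda x: 0.7))   # True: predicted improvement
print(wrapper.consider_modification(lambda x: 0.1))   # False: would degrade the goal
```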

Alexey Potapov and Sergey Rodionov. Universal empathy and ethical bias for artificial general intelligence

Extended Abstract: Existing mathematical models of artificial general intelligence usually include optimization of some value function as the main drive of an intelligent agent. However, there is the well-known problem of complex value functions: it is very difficult (if possible at all) to specify a value function that will guarantee safe behavior of the agent. Such a function must be expressed using high-level concepts, which are not available to an unbiased (or infant) intelligence. A safe, complex value function could be computed by a developed adult intelligence, but designing an adult intelligence directly faces various theoretical and practical problems (among them the symbol grounding problem and the consistency of built-in and acquired knowledge). A more appealing and natural approach to AGI is to design an infant intelligence (probably biased, but without a huge knowledge base) that can learn and grow up. However, the internal value function of such an infant intelligence can only be rather simple and therefore unsafe. Consequently, the most direct way to safe AGI is to rely on external value functions, which are computed by existing adult intelligent agents (including humans). Thus, the infant intelligence should have methods for acquiring the values of external functions and for reconstructing those functions so that it becomes able to compute them itself in the future (or when external values are not available).

In our report, we discuss several technical possibilities for supplying external value functions to the intelligent agent. These possibilities include a detached communication channel, built-in methods for interpreting sensory data (such as emotion recognition), and treating internal value signals (such as pain and pleasure) as external value functions during certain sensitive periods. However, the main problem is not merely to supply the intelligent agent with the values of external functions. Direct maximization even of an external value function is also unsafe: for example, the intelligent agent may try to force humans to smile, or to transmit high values over the detached channel, instead of making them happy. Thus, the agent should generalize the obtained values, not in order to predict future values (as is done in conventional models of reinforcement agents), but in order to reveal the hidden factors behind the external value functions. These hidden factors should then become valuable in themselves (i.e., become components of the value system, or terms in the internally computable value function). A preliminary mathematical model of this process, based on the representational minimum description length principle derived from the model of universal induction, is proposed. Because representations can be used not only to introduce inductive bias in incremental learning but also to introduce priors, one can also specify an “ethical bias” for a universal intelligence within this approach.

To do this, we consider the AIXI model and modify it by introducing a specific representation for environment models. These models are represented as mixtures of programs, some of which are arbitrary while others try to maximize their own value functions (as the basic AIXI agent does). The latter programs are interpreted as models of other agents. Of course, such a mixture model of the environment could be learned by the basic AIXI. However, this representation not only reduces the complexity of the mixture model, using built-in functions for modeling other agents, but also allows the intelligent agent to interpret some elements of this model as external value functions (which cannot be done in the basic AIXI). We refer to this “ethical bias” as “universal empathy” and discuss its possible use in safe AGI.
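
For reference, the unmodified AIXI action selection that serves as the starting point here can be written, following Hutter's standard formulation, as an expectimax over all environment programs consistent with the interaction history (the notation below is the standard one, not the authors' modified representation):

```latex
% Standard AIXI action selection (Hutter's formulation): m is the horizon,
% U a universal monotone Turing machine, and \ell(q) the length of program q.
a_k \;=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
      \bigl( r_k + \cdots + r_m \bigr)
      \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

As we read the abstract, the proposed modification structures the class of programs q in the inner sum so that some mixture components are themselves value-maximizing agent models, which is what lets the agent read external value functions off its environment model.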

Stuart Armstrong: Predicting AI… or failing to

Extended Abstract: Predictions about the future development of artificial intelligence are as confident as they are diverse. Starting with Turing’s initial estimate of a 30% pass rate on the Turing test within 50 years (Turing, 1950), computer scientists, philosophers and journalists have never been shy to offer their own definite prognostics, claiming AI to be impossible (Jacquette, 1987) or just around the corner (Darrach, 1975).
What are we to make of these predictions? What are they for, and what can we gain from them? Are they to be treated as light entertainment, the equivalent of fact-free editorials about the moral decline of modern living? Or are there some useful truths to be extracted? Can we feel confident that certain categories of experts can be identified, and that their predictions stand out from the rest in terms of reliability?

In this paper, we’ll start off by proposing classification schemes for AI predictions: what types of predictions are being made, and what kind of arguments or models are being used to justify them. Armed with this scheme, we’ll then analyse some of these approaches from the theoretical perspective, seeing whether there are good reasons to believe or disbelieve their results. The aim is not simply to critique individual methods or individuals, but to construct a toolbox of assessment tools that will both enable us to estimate the reliability of a prediction, and allow predictors to come up with better results themselves.

Those theoretical results will be supplemented with the real meat of the paper: a database of 257 AI predictions (partially available online at www.neweuropeancentury.org/SIAI-FHI_AI_predictions.xls), made in a period spanning from the 1950s to the present day. This database was assembled by researchers systematically searching through the available online literature, and is a treasure-trove of interesting predictions. Delving into this will enable us to show that there seems to be no such thing as an “AI expert” for timeline predictions: no category of predictors stands out from the crowd.

The final point of interest is the unexpected robustness of some philosophical arguments. Philosophers making very general meta-arguments seem to have higher added value to the reader than computer scientists giving an expected date for the arrival of AI.

Bibliography

Darrach, B. (1975). Meet Shakey, the First Electronic Person. Reflections of the Future.
Jacquette, D. (1987). Metamathematical criteria for minds and machines. Erkenntnis.
Turing, A. (1950). Computing machinery and intelligence. Mind, 433-460.

Andrzej M.J. Skulimowski. Trends and Scenarios of Selected Key AI Technologies until 2025

Extended Abstract: The main objective of this paper is to present the results that have been obtained during an AI-related foresight exercise [1] carried out during the period 2010-2012 and co-financed by the ERDF. This project aimed at extracting, formulating and analysing the general rules and principles that govern the evolution of selected artificial intelligence (AI) technologies in order to forecast AI trends and impacts until 2025.
The foresight process has been organised within the framework of information fusion in tailored expert systems, hereafter termed foresight support systems (FSS). This particular class of expert system is distinguished by the special attention paid to combining qualitative expert knowledge, acquired in Delphi and similar exercises, with quantitative trends and forecasts. Another feature of FSS is the ability to take into account other kinds of knowledge, such as patent or bibliographic data fed in by automatic web crawlers, and to process them together with quantitative trends in a uniform way in real time. Technological evolution is modelled as a discrete-continuous-event system. Sequences of elementary events may be linked by trends, thus forming extended episodes. These are clustered to form 3-5 scenarios. In the knowledge processing model, general technological and economic trends as well as social demand form the inputs to a control system with feedback [2], while the outputs model the impact of technological and social demand on R&D in AI, the supply of intelligent products, and their consumption patterns.

The technological focus areas of the project are listed below:

  • Expert systems, including decision support and diagnostic systems, and recommenders
  • Machine vision, neurocognitive systems and technologies, such as BCI
  • Autonomous systems: mobile, stationary, or virtual
  • Key AI application areas, specifically e-government, e-health, m-health, and e-commerce
  • Molecular and quantum computing.

Basic ICT technologies serve merely as input or explanatory variables for the remaining technologies and applications. The study on quantum computing differs slightly from the other subjects as the industrial applications in this area are still in the prototype phase. This task serves mainly to determine R&D priorities and the impact on other AI technologies.

Knowledge management methods and analytical IT tools have been combined in the FSS in order to achieve the project objectives described above; they also constitute independent research aims in their own right. These include:

  • An ontological knowledge base which stores data together with technological, economical and social evolution models, trends and scenarios
  • Algorithms of multicriteria rankings suitable for assessing AI-related technologies and capable of generating constructive recommendations for decision makers as regards strategic technological priorities and R&D investment strategies
  • A foresight-specific Analytic Machine coupled with the knowledge base that includes, among others, trend-impact and cross-impact analysis, elementary scenario clustering, and technological roadmapping, making possible a detailed analysis of AI development trends and scenarios in the above-mentioned areas (a minimal cross-impact sketch follows this list)
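
As an illustration of one of these analytic components, a minimal cross-impact simulation might look as follows (a generic, textbook-style sketch with hypothetical events and probabilities, not the project's actual Analytic Machine):

```python
# Generic cross-impact analysis sketch: each event has an initial probability,
# and the occurrence of one event multiplies the probability of others.
# Monte Carlo runs estimate the resulting outcome frequencies.
import random

initial_prob = {                      # hypothetical AI-foresight events
    "cheap_BCI_hardware": 0.5,
    "mass_market_recommenders": 0.7,
    "autonomous_decision_systems": 0.3,
}
# impact[a][b]: multiplier applied to P(b) if event a occurs (hypothetical values)
impact = {
    "cheap_BCI_hardware": {"autonomous_decision_systems": 1.5},
    "mass_market_recommenders": {"autonomous_decision_systems": 1.3},
    "autonomous_decision_systems": {},
}

def run_once(rng):
    prob = dict(initial_prob)
    occurred = set()
    for event in rng.sample(list(prob), len(prob)):   # random evaluation order
        if rng.random() < min(prob[event], 1.0):
            occurred.add(event)
            for other, factor in impact[event].items():
                prob[other] = min(prob[other] * factor, 1.0)
    return occurred

rng = random.Random(0)
runs = [run_once(rng) for _ in range(10_000)]
freq = {e: sum(e in r for r in runs) / len(runs) for e in initial_prob}
print(freq)   # event frequencies after cross-impacts have been applied
```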

Products, technologies and their real-life applications in selected technological areas have been submitted by the industrial partners of the project and stored in the knowledge base, serving as a basis for benchmarking.

The above FSS is capable of handling complex consumer preference models concerning intelligent hardware and software. These are then applied as inputs to technology and market models concerning education and health care services, media, internet advertising and quantitative information markets. As a result, the characteristics of technological evolution elaborated within the project provide clues to intelligent technology providers about future demand. They can also give R&D and educational institutions some idea of the most likely directions of development and of demand for AI professionals, as well as inform authorities and other interested parties about the potential threats and disruptive changes that may be caused by the irresponsible implementation of some AI technologies.
As a real-life application example, this paper will analyse further the evolution of decision support systems (cf. [3] for preliminary results) and their impact on technological progress, consumption patterns and social behaviour. Among other qualitative trends, the emergence of autonomous decision systems will be studied in more detail. These can be classified according to their degree of free will (cf. [4]) and may be embedded in decision support systems, transforming them into actual co-decision makers. Another related trend, driven by the development of web technologies and data storage capacities, is the global integration of knowledge bases and monitoring systems. We will discuss the impact of the AI trends discovered on intellectual production capacities, privacy and social behaviour. Furthermore, as regards the impact on the IT industry, we will show how recommender systems can potentially aid software companies in the development of e-commerce applications.

Carl Shulman. Could we use untrustworthy human brain emulations to make trustworthy ones?

Extended Abstract: One possible route to creating Artificial General Intelligence (AGI) is by creating detailed models of human brains which can substitute for those brains at various tasks, i.e. human Whole Brain Emulation (WBE) (Sandberg and Bostrom, 2008). If computation were abundant, WBEs could undergo an extremely rapid population explosion, operate at higher speeds than humans, and create even more capable successors in an “intelligence explosion” (Good, 1965; Hanson, 1994). Thus, it would be reassuring if the first widely deployed emulations were mentally stable, loyal to the existing order or human population, and humane in their moral sentiments.
However, incremental progress may make it possible to create productive but unstable or inhuman emulations first (Yudkowsky, 2008). For instance, early emulations might loosely resemble brain-damaged amnesiacs that have been gradually altered (often in novel ways) to improve performance, rather than digital copies of human individuals selected for their stability, loyalties, and ethics. A business or state that waited to develop more trustworthy emulations would then delay and risk losing an enormous competitive advantage. The longer the necessary delay, the more likely untrustworthy systems would be widely deployed.

Could low-fidelity brain emulations, intelligent but untrusted, be used to greatly reduce that delay? Trustworthy high-quality emulations could do anything that a human development team could do, but could do it a hundredfold more quickly given a hundredfold hardware improvement (perhaps with bottlenecks from runtime of other computer programs, etc). Untrusted systems would need their work supervised by humans to prevent escape or sabotage of the project.

This supervision would restrict the productivity of the emulations, and introduce bottlenecks for human input, reducing the speedup.
This paper discusses tools for human supervision of untrusted brain emulations, and argues that supervised, untrusted brain emulations could result in large speedups in research progress.
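
The size of that reduction can be made concrete with a standard Amdahl's-law-style calculation (our own illustrative framing with hypothetical numbers, not from the paper): if a fraction f of the project's work is serialized at human speed because it requires supervision or human input, while the rest runs at an emulation speedup of s, the overall speedup is bounded as follows.

```latex
% Amdahl's-law-style bound (illustrative): f = fraction of work serialized at
% human speed, s = speedup of the unsupervised remainder.
\text{overall speedup} \;=\; \frac{1}{\,f + (1-f)/s\,}
% e.g. with s = 100 and f = 0.1 (10\% of work bottlenecked on human oversight),
% the overall speedup is only 1/(0.1 + 0.9/100) \approx 9.2, not 100.
```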

Anders Sandberg. Ethics and Impact of Brain Emulations

Extended Abstract: This paper aims to give an overview of the ethical issues raised by the brain emulation approach, and to analyse how they should affect responsible policy for developing the field.

There are four domains of ethical issues:

  • Work aiming at developing the technology and emulating animal systems
  • The weakening of the unitary nature of death
  • Emulations of human individuals or humanlike minds
  • The existential risks posed (or reduced) by emulation technology

Robin Hanson. Envisioning The Economy, and Society, of Whole Brain Emulations

Extended Abstract: The three biggest changes in history, so far, were at the arrival of humans, farming, and industry. If a similar big change lies ahead, a good guess for its source is artificial intelligence in the form of whole brain emulations, or “ems.” Most who consider ems use them to set dramatic stories, or to debate em feasibility and timing, or to ponder their implications for the philosophy of identity. I instead seek social implications – what sort of world would they make?

Most who consider em social implications paint heavens or hells, or seek a new social science for this new social era. In contrast, I seek to straightforwardly apply standard social science to these novel technical assumptions. I focus on a low-regulation scenario, as this is the standard policy reference scenario.

To further ease analysis, I focus on an early em era when brains can be easily copied, but are mostly opaque. This ensures ems have recognizably human mental styles, and precludes partial minds, mixing and matching mind parts, or merging diverged copies. I also assume that the physical cost of running em minds is roughly linear in speed, up to a million times human speed, and that ems can change speed by changing hardware.