== Early history ==
Automated art dates back at least to the
automata of
ancient Greek civilization, when inventors such as
Daedalus and
Hero of Alexandria were described as designing machines capable of writing text, generating sounds, and playing music. Creative automatons have flourished throughout history, such as
Maillardet's automaton, created around 1800 and capable of creating multiple drawings and poems. Also in the 19th century,
Ada Lovelace wrote that "computing operations" could potentially be used to generate music and poems. In 1950,
Alan Turing's paper "
Computing Machinery and Intelligence" focused on whether machines can mimic human behavior convincingly. Shortly after, the academic discipline of artificial intelligence was founded at a research
workshop at
Dartmouth College in 1956. Since its founding, AI researchers have explored philosophical questions about the nature of the human mind and the consequences of creating artificial beings with human-like intelligence; these issues have previously been explored by
myth,
fiction, and
philosophy since antiquity.
== Artistic history ==
Since the founding of AI in the 1950s, artists have used artificial intelligence to create artistic works. These works were sometimes referred to as
algorithmic art,
computer art,
digital art, or
new media art. One of the first significant AI art systems is
AARON, developed by
Harold Cohen beginning in the late 1960s at the
University of California at San Diego. AARON uses a symbolic rule-based approach, typical of the era of
GOFAI programming, to generate images, and Cohen developed it with the goal of coding the act of drawing. AARON was exhibited in 1972 at the
Los Angeles County Museum of Art. From 1973 to 1975, Cohen refined AARON during a residency at the
Artificial Intelligence Laboratory at
Stanford University. In 2024, the
Whitney Museum of American Art exhibited AI art from throughout Cohen's career, including re-created versions of his early
robotic drawing machines. In both 1991 and 1992, Karl Sims won the Golden Nica award at
Prix Ars Electronica for his videos using artificial evolution. In 1997, Sims created Galápagos, an interactive artificial-evolution installation that allowed visitors to evolve 3D animated forms, for the
NTT InterCommunication Center in Tokyo. Sims received an
Emmy Award in 2019 for outstanding achievement in engineering development. In 1999,
Scott Draves and a team of several engineers created and released
Electric Sheep as a
free software screensaver.
Electric Sheep is a volunteer computing project for animating and evolving
fractal flames, which are distributed to networked computers that display them as a screensaver. The screensaver used AI to create an infinite animation by learning from its audience. In 2001, Draves won the Fundación Telefónica Life 4.0 prize for
Electric Sheep. In 2014,
Stephanie Dinkins began working on
Conversations with Bina48. For the series, Dinkins recorded her conversations with
BINA48, a social robot that resembles a middle-aged black woman. In 2019, Dinkins won the
Creative Capital award for her creation of an evolving artificial intelligence based on the "interests and culture(s) of people of color." In 2015,
Sougwen Chung began
Mimicry (Drawing Operations Unit: Generation 1), an ongoing collaboration between the artist and a robotic arm. In 2019, Chung won the
Lumen Prize for her continued performances with a robotic arm that uses AI to attempt to draw in a manner similar to Chung. In 2018, an auction sale of artificial intelligence art was held at
Christie's in New York where the AI artwork
Edmond de Belamy sold for almost 45 times its high estimate of US$10,000. The artwork was created with a generative adversarial network by Obvious, a Paris-based collective. In 2024, the Japanese film
generAIdoscope was released. The film was co-directed by
Hirotaka Adachi, Takeshi Sone, and Hiroki Yamaguchi. All video, audio, and music in the film were created with artificial intelligence. In 2025, the Japanese
anime television series
Twins Hinahima was released. The anime was produced and animated with AI assistance for cutting and for converting photographs into anime illustrations, which were later retouched by art staff. Most of the remaining elements, such as the characters and logos, were hand-drawn with various software.
== Technical history ==
Deep learning, characterized by multi-layer neural networks loosely modeled on the human brain, rose to prominence in the 2010s, causing a significant shift in the world of AI art. The deep learning era has produced mainly four families of generative models:
autoregressive models,
diffusion models,
GANs, and
normalizing flows. In 2014,
Ian Goodfellow and colleagues at
Université de Montréal developed the
generative adversarial network (GAN), a type of
deep neural network capable of learning to mimic the
statistical distribution of input data such as images. The GAN uses a "generator" to create new images and a "discriminator" to decide which created images are considered successful. Unlike previous algorithmic art that followed hand-coded rules, generative adversarial networks could learn a specific
aesthetic by analyzing a
dataset of example images. A related technique, Google's DeepDream (2015), amplifies patterns that a convolutional neural network detects in an image, producing deliberately over-processed images with a dream-like appearance reminiscent of a psychedelic experience. Conditioning a GAN on both random noise and a specific class label improved the quality of image synthesis in class-conditional models.
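The generator/discriminator game can be sketched with a toy one-dimensional example. Everything below is a deliberately minimal stand-in, not Goodfellow's original setup: the generator is a single linear map, the discriminator is logistic regression, and the gradients are written out by hand.

```python
import numpy as np

rng = np.random.default_rng(0)
REAL_MU, REAL_SD = 4.0, 0.5   # the "aesthetic" to imitate: real samples ~ N(4, 0.5)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Generator g(z) = w*z + b, discriminator d(x) = sigmoid(a*x + c).
w, b = 1.0, 0.0
a, c = 0.0, 0.0
lr, batch = 0.05, 128
b_hist = []

for step in range(3000):
    real = rng.normal(REAL_MU, REAL_SD, batch)
    z = rng.normal(size=batch)
    fake = w * z + b

    # Discriminator: gradient ascent on log d(real) + log(1 - d(fake)).
    d_real, d_fake = sigmoid(a * real + c), sigmoid(a * fake + c)
    a += lr * np.mean((1 - d_real) * real - d_fake * fake)
    c += lr * np.mean((1 - d_real) - d_fake)

    # Generator: gradient ascent on the non-saturating objective log d(fake).
    d_fake = sigmoid(a * fake + c)
    w += lr * np.mean((1 - d_fake) * a * z)
    b += lr * np.mean((1 - d_fake) * a)
    b_hist.append(b)

# The adversarial game tends to orbit the target rather than settle,
# so report a time-averaged generator mean.
print("generator mean (time-averaged):", np.mean(b_hist[-1500:]))
```

Even in this toy setting the two players tend to oscillate around the target distribution instead of converging cleanly; instabilities of exactly this kind motivated much of the later work on GAN training.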
Autoregressive models were used for image generation, such as PixelRNN (2016), which autoregressively generates one pixel after another with a
recurrent neural network. Immediately after the
Transformer architecture was proposed in
Attention Is All You Need (2017), it was applied to autoregressive image generation, initially without text conditioning. The website
Artbreeder, launched in 2018, uses the models
StyleGAN and BigGAN to allow users to generate and modify images such as faces, landscapes, and paintings. In the 2020s,
text-to-image models, which generate images based on
prompts, became widely used, marking yet another shift in the creation of AI-generated artworks. In January 2021, OpenAI introduced DALL-E, an autoregressive generative model with essentially the same architecture as GPT-3. Later in 2021,
EleutherAI released the
open source VQGAN-CLIP based on OpenAI's CLIP model.
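The mechanism behind CLIP-guided systems such as VQGAN-CLIP can be sketched in a few lines: embed the image and the text prompt into a shared space, then gradient-ascend the image so the cosine similarity between the two embeddings rises. In this sketch a frozen random linear map stands in for CLIP's image encoder, a fixed random vector stands in for the prompt embedding, and a raw vector is optimized instead of VQGAN latents; all three are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen toy "encoder": a random linear map standing in for CLIP's image tower.
D_IMG, D_EMB = 64, 16
E_img = rng.normal(size=(D_EMB, D_IMG)) / 8.0
text_emb = rng.normal(size=D_EMB)          # stand-in for an encoded text prompt

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# VQGAN-CLIP-style loop: ascend on the *image* so its embedding aligns
# with the text embedding, using the analytic gradient of cosine similarity.
x = rng.normal(size=D_IMG)                 # the "image" being optimized
start = cosine(E_img @ x, text_emb)
for _ in range(2000):
    u = E_img @ x
    nu, nt = np.linalg.norm(u), np.linalg.norm(text_emb)
    grad_u = text_emb / (nu * nt) - (u @ text_emb) * u / (nu**3 * nt)
    x += 0.2 * E_img.T @ grad_u            # chain rule back to image space

print(start, cosine(E_img @ x, text_emb)) # similarity rises toward 1
```

In the real system the gradient flows through CLIP's encoder and VQGAN's decoder by automatic differentiation; here the cosine gradient is written out analytically to keep the example dependency-free.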
Diffusion models, generative models that create synthetic data resembling existing data, were first proposed in 2015, but they only surpassed GANs in image quality in 2021.
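The two halves of this idea, a forward process that gradually noises data and a reverse process that denoises step by step, can be demonstrated end to end on one-dimensional Gaussian data, for which the ideal denoiser has a closed form and can stand in for the trained network (an assumption made only so the sketch is self-contained):

```python
import numpy as np

rng = np.random.default_rng(0)
MU, SIGMA = 2.0, 0.5                  # toy data distribution: x0 ~ N(2, 0.5)

T = 300
betas = np.linspace(1e-4, 0.05, T)    # assumed linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)   # cumulative signal retained at step t

def forward_noise(x0, t):
    """Closed-form forward process: x_t = sqrt(ab_t)*x0 + sqrt(1-ab_t)*eps."""
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * eps

def ideal_denoiser(xt, t):
    """Exact E[x0 | x_t] for Gaussian data -- stands in for a trained network."""
    ab = alpha_bar[t]
    return ((1 - ab) * MU + np.sqrt(ab) * SIGMA**2 * xt) / ((1 - ab) + ab * SIGMA**2)

# Forward: data is gradually destroyed into (nearly) pure noise.
noised = forward_noise(rng.normal(MU, SIGMA, 5000), T - 1)

# Reverse: a deterministic DDIM-style pass from noise back to data.
x = rng.normal(size=5000)             # start from x_T ~ N(0, 1)
for t in range(T - 1, 0, -1):
    x0_hat = ideal_denoiser(x, t)
    eps_hat = (x - np.sqrt(alpha_bar[t]) * x0_hat) / np.sqrt(1 - alpha_bar[t])
    x = np.sqrt(alpha_bar[t - 1]) * x0_hat + np.sqrt(1 - alpha_bar[t - 1]) * eps_hat

print(noised.mean(), noised.std())    # ~N(0, 1): information destroyed
print(x.mean(), x.std())              # ~N(2, 0.5): data statistics recovered
```

With a trained model, the closed-form denoiser is replaced by a neural network that predicts the noise (or the clean sample) at each step; the surrounding sampling loop is essentially unchanged.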
The latent diffusion model was published in December 2021 and became the basis for the later
Stable Diffusion (August 2022), developed through a collaboration between Stability AI, CompVis Group at LMU Munich, and Runway. In 2022,
Midjourney was released, followed by
Google Brain's
Imagen and Parti, announced in May 2022, and
Microsoft's NUWA-Infinity. The same year, DALL-E 2, a successor to DALL-E, was beta-tested and released, with the further successor DALL-E 3 following in 2023. Stability AI offers a Stable Diffusion web interface called DreamStudio, plugins for
Krita,
Photoshop,
Blender, and
GIMP, and the
Automatic1111 web-based open source
user interface. Stable Diffusion's main pre-trained model is shared on the
Hugging Face Hub.
Ideogram, released in August 2023, is known for its ability to generate legible text. In 2024,
Flux was released. This model can generate realistic images and was integrated into
Grok, the chatbot used on
X (formerly Twitter), and
Le Chat, the chatbot of
Mistral AI. Flux was developed by Black Forest Labs, founded by the researchers behind Stable Diffusion. Grok later switched to its own text-to-image model
Aurora in December of the same year. Several companies have also integrated AI image-generation models into their image-editing products.
Adobe has released and integrated the AI model
Firefly into
Premiere Pro,
Photoshop, and
Illustrator. Microsoft has also publicly announced AI image-generator features for
Microsoft Paint. Some examples of
text-to-video models of the mid-2020s are
Runway's Gen-4, Google's
VideoPoet, OpenAI's
Sora, which was released in December 2024, and
LTX-2, which was released in 2025. Several other image models debuted in 2025.
GPT Image 1 from
OpenAI, launched in March 2025, introduced new text rendering and multimodal capabilities, enabling image generation from diverse inputs like sketches and text.
Midjourney V7 debuted in April 2025 with improved text-prompt processing. In May 2025,
Flux.1 Kontext by Black Forest Labs emerged as an efficient model for high-fidelity image generation, while
Google's Imagen 4 was released with improved photorealism. Flux.2 debuted in November 2025 with improved image reference, typography, and prompt understanding.
== Tools and processes ==