Help us build THE WORLD‘S LARGEST, OPEN, BILLION-SCALE Image-Text-Dataset – Perfect for training DALL-E, CLIP & other multimodal models

Fundraising campaign by Christoph Schuhmann
  • US$75.00
    Donated So Far
Help this ongoing fundraising campaign by making a donation and spreading the word.
Show more
Show less

We already released LAION-400M, the world’s largest openly available image-text-pair data-set with 400 million samples.( https://laion.ai/laion-400-open-dataset/ )

- Now we need your support to build a DATASET WITH BILLIONS of IMAGE-TEXT-PAIRS, each of high quality, with text that fits really well to the image.

Together with the data-set we will also release a K-nearest neighbor index, that allows you to explore the data-set easily by performing image and text similarities searches.

Is this enables you to easily domain specific data sets by filtering out sub sets.


Here can you find a graphical user interface for the index be released with 400 million data-set:

https://rom1504.github.io/clip-retrieval/?back=https%3A%2F%2Fsplunk.vra.ro%2F&index=laion_400m_64G&query=spiderman+picachu



The project started in April with one PC and evolved on the DALL-E Discord server, Where more and more people contributed by improving the code and donating time on their computers.

As we began to realize in July that we will reach the hundreds of millions within the next months, we decided to aim for several billions of samples and to create a nonprofit organization as a structure that allows us to receive funding for this and hopefully many more exciting projects to come!

Here can you find our homepage:

https://laion.ai

- We all work in our free time, completely for free. Many of us spent hundreds, some of us even thousands of dollars of our private money to get us this far and to release the LAION 400M data-set.


Now we need your support to get to sufficient funding to extend or a data-set into the billions!


For every 5.000$ we get we will be able to extend our data-set by at least 1 billion samples, conservatively estimated, … likely by more!


This will enable researchers & organizations from all over the world to ...

  1. to train gigantic models like DALL-E and CLIP ... BUT MUCH BIGGER (!) ...
  2. to easily build LARGE DOMAIN-SPECIFIC DATASETS with our KNN index and
  3. to easily analyze captions & images of SIGNIFICANT PARTS OF THE INTERNET.


SUPPORT US NOW BY

  • Spreading the word by talking with your friends & colleagues about LAION 400M and our goal to extend it into the 2-digit billions.
  • Sharing the links to this campaign & to LAION 400M on Twitter, Discord, Youtube, Facebook, Instagram and other social media platforms you're using.
  • Using LAION 400M in your research or company to train EPIC MODELS with it and to write AWESOME PAPERS.
  • Donating money to this campaign.



HELP US TO BOLDLY GO, WHERE NO DATA SCIENTIST HAS GONE BEFORE!!!



( If you have any questions, don't hesitate to join the DALL-E Discord server & chat with us in the CRAWLING@HOME section )

Organizer

  • Christoph Schuhmann

Donors

  • Jimmie
  • Donated on Sep 20, 2021
$50.00
  • funny ai
  • Donated on Sep 16, 2021
$15.00
  • Anonymous
  • Donated on Sep 15, 2021
$10.00

No updates for this campaign just yet

Donors & Comments

3 donors
  • Jimmie
  • Donated on Sep 20, 2021
$50.00
  • funny ai
  • Donated on Sep 16, 2021
$15.00
  • Anonymous
  • Donated on Sep 15, 2021
$10.00

Followers

0 followers
No Followers Just Yet...
US$75.00
Donated So Far

Help this ongoing fundraising campaign by making a donation and spreading the word.

Not Ready to Donate?

Did you know a 10 second Facebook share raises an average of $25?

Share on Facebook