It's important right now to tighten up the argument of the video quite a bit. Points are made all over the place without much in the way of a logical sequence. It's useful to make arguments proximate to their evidence, but that can make the video confusing for viewers.
In the crypto video, I laid out how NFTs work before making any arguments (although I did establish a tone with the title card/sequence). The point of the video is also immediately clear. Here's the first paragraph:
At this point you probably know at least a little bit about NFTs. Maybe that they're bad for the environment, maybe that they're aesthetically repellent, or maybe that they're a scam. The emerging genre of games I'll be talking about in this video sets itself apart by using cryptocurrencies and NFTs. Things will get funny, because a lot of this stuff is very stupid, but first we need some definitions.
The AI video currently starts with an explanation/apology for the opening, then gets into the amount of manual labour required to train an AI, then mentions the AI apocalypse BS, then finally transitions to an explanation of neural networks. It's awkward, messy, rambling. Descent into hell is obviously the best structure for a video like this; start somewhere innocuous and then keep going deeper into the subject until I can talk about the apocalypse/nature of truth stuff.
I have done a good job unifying and tightening up everything I've written so far. The technical explanation is still quite long. This article is a nice short one (link removed); I want to get into a little more detail, but I also need to explain supervised vs. unsupervised learning and the fact that the goal of machine learning is to pick up patterns in the training data. Later parts already rely on these facts, so I would like them to be in the technical explanation.
In general I think the script is getting there. It has kind of a wandering quality to it, but that is sort of the way I write anyway; I like to surprise people, so sudden, seemingly random transitions between topics work their way in pretty often. This may be why I do badly when I need to pre-structure my writing.
The video proper (Minecraft part excluded) is ~4200 words right now; I wanted an absolute maximum of 7500. A lot of the groundwork is done, so I think the "higher level issues" part can be a semi-stream-of-consciousness section of around 500-1000 words.
AI Apocalypse theory. Labour issues -> skill decay. Nature of truth -> elimination of time and space. Loneliness and manipulation.
Still getting there. 9000 words right now. The early sections are the main issue, and I am pretty sure I can compress them a lot without losing any substance. Right now it's structured as an overview of generative AI in games, then a reflection on the discourse around AI, then the stuff that actually bothers me about AI.
I think the ending is fine (maybe it needs some integration with the earlier parts), but the first two sections should really be combined to pair concrete examples with concepts. That will give the script a better sense of movement and probably cut the word count quite a bit.
As for the last part, I don't like the doomerism. I want to find some positive note to end it on. I can always defer to "tell your friends you love them" etc., but I would like to do something more.
The nature-of-truth talk was kind of forgotten; it would be an interesting coda/transition between the gaming/art-focused stuff and the existential talk at the end. I have also neglected to talk about labour issues.
Finally, after I add all that shit, I want to keep the substance of the video to 7500 words or so. The footnotes are a huge chunk of the word count right now: it's ~7000 words of human-written work, plus ~700 words at the beginning.
I can imagine web "apps" getting much worse with AI, but people (or enthusiasts at least) aren't going to give up programs and games.
OpenUSD seems like a good idea, although not really applicable for hobby-level graphics work. Anything Adobe supports is sus at this point; maybe it's somehow evil.
A few finer points have not made their way in either. I didn't really talk about recommendation algorithms, and I barely mentioned the microtransaction stuff. I need to simultaneously expand and shorten the bit about cops; it feels like begging for applause right now. I'm better off sticking to facts and offering some kind of terse statement about it. The "FBI's bread and butter" idea is good enough to carry it, so I can take out the "excuses to murder somebody" line.
I am going to lean into the doomerism a little bit. I think the video will be called AI is rotten to the core or The AI industry is rotten to the core, or something like that. Report from Hell is a great suffix but probably harms the video's recommendations. The thumbnail is still fantastic.
Also started thinking about visuals and their flow. The beginning will be very smooth, transitioning from CRT to digital with the warps I used in the Teleglitch video. Back to CRT for the technical explanation, then I'll adapt from there. I'd like to bring back the CRT later just because it looks so cool, but I think cutting is fine once the video is really rolling.
Maybe I'll do some kind of montage on the CRT. If I go for something like that, I should go in really tight on the screen, so as to present it purely as a visual without reminding viewers of the analog/digital discrepancy. If attention is drawn to the room, they'll get the wrong idea. I am imagining a dramatic speech at a public event on cable TV; the weight/reality of the visual is reinforced by the analog video signal.
For "filler" parts (i.e. anything that's not a set piece) I will try to think of funny confluences between what I'm talking about and games, some ironic visual that the viewer doesn't need to pay total attention too. Will also try to use the P2W tricks, relevant and obscure YouTube videos of varying quality.
The trouble is that I don't have some moral or emotional point to build to; the whole video is flatly negative. I think some statement about the power of the Silicon Valley nuts could be good, maybe eliciting a god complex with editing somehow. Thinking of that Crowbcat video with the animation of the guy looking at his reflection (was it about MTX?).
Well, I gave it some more time and I'm pretty happy with the script now. Started editing around the 31st; getting the CRT footage was a hangup, but I'm up to 15 minutes of edited footage now. I'm almost out of voiceover, so I will tackle that tomorrow. I gave up on the 7500-word limit; I'm well past 10,000 now. Greatly expanded the ending section with more content, greatly expanded the "worried experts" stuff, and segued it into a discussion of AI adjustment.
I have been experimenting with Stable Diffusion, and looking at a lot of art people have made with it, and I think in my next editing pass I should emphasize that this stuff is getting really good. I throw around "melted" as an insult several times--and most art generated by novices like me does look melty--but there is a lot of impressive Diffusion stuff.
Am I wrong about creativity? The AI certainly is not creative; that is not in doubt. The ethical issues are there too. But going back to the photography analogy that AI people use, there actually are a lot of levers for creativity if you use the Stable Diffusion webui: inpainting, LoRAs, hypernetworks, all of the regular parameters, checkpoint merges...
But the sticking point, I guess, is still that the user is merely nudging the network to do what they want. The AI "artist" is flying blind, writing out different prompts and seeing what happens. Photography is not a process of typing in a description; it's a subtractive labour process that renders the possibilities of the given image into a single, thematic object: the photograph.
I think the commissioning analogy is still appropriate, because the primary skill in AI art generation is description (plus technical understanding of the model and generation parameters), while no skill is exercised in rendering the images. The user can only hint at, e.g., composition (striking pose, from underneath, rule of thirds, etc.). The precise composition and content are a matter of chance, decided not so much by the user as by the user's patience in regenerating hundreds of times. The same goes for colour choices, lens characteristics, and so on.
A very distinct difference between generating images with AI and making them is that AI fails to integrate artifacts into the image. CRT TV passes/NTSC filters/glitching/JPEG artifacts are pretty important signifiers in digital art, and in my experience AI either smooths them over or hyperfixates on them and garbles the final image.
The parameters of an AI are ways to manipulate the probability of a nice-looking output.
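To make that concrete for myself: outside the webui, the levers are just sampling parameters. Here's a rough sketch using the diffusers library (the checkpoint name, prompt, and numbers are arbitrary examples, not anything from the video, and it assumes a CUDA GPU):

```python
# Rough sketch of generating an image outside the webui with Hugging Face's
# diffusers library. Checkpoint, prompt, and parameter values are made up.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # example checkpoint
    torch_dtype=torch.float16,
).to("cuda")                            # assumes a CUDA GPU is available

# Every "creative" lever is a knob on the sampling process: how hard to push
# toward the prompt, how many denoising steps, which random seed to start from.
image = pipe(
    prompt="oil painting of a lighthouse at dusk, dramatic lighting",
    negative_prompt="blurry, melted, extra limbs",
    guidance_scale=7.5,             # how strongly to follow the prompt
    num_inference_steps=30,         # more steps, usually a cleaner image
    generator=torch.Generator("cuda").manual_seed(1234),  # the "reroll" knob
).images[0]

image.save("lighthouse.png")
```

Changing the seed is the "regenerate hundreds of times" part; everything else just shifts the odds of a nice-looking output.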
I should also stress in the script that the phenomenology of AI art creation matters very little. Arguments about ethics and philosophy are for peons like us, while the people with money and power are going to do whatever they want and hire somebody to argue that it's ethical.
I got so much work done yesterday! Around ten minutes of video produced, I'm working at a fantastic pace right now.
This morning I did a final pass of the script, adding some copyright-related info I found, revising the ending, and changing parts to be more readable. I am pretty confident that I can call it locked now and finish up the VO. Really happy with the ending section; it perfectly captures the report-from-hell/descent-into-hell idea, jumping from content algorithms to AI policing to the fight for the future, and tying it into the discourse points about AI like bias and AI adjustment.
It's a great expression of what I'm trying to say in general, capturing that passage between concepts and concrete reality in a way that describes reality and exposes contradictions.
Too tired to edit this morning, so it's time for a test render.
In general: the content is good, but the voice is edited much too tightly in the early parts. I may need to recut the ML explanation, let each sentence breathe a little more, and rerecord it. I fixed a major problem that I've encountered a couple of times: (link to a tutorial I made for myself).
Up until the ML explanation I'm really happy with it; the moment the music kicks in with the second CRT segment is awesome. Very "cyber". Really, I just want to ruminate a little more on 04:44-10:32 and decide if it needs scripting or VO rework. Redoing the explanation will be painful, but I should do it if I need to.
On further reflection, I think it's pretty good. I may have been half-asleep the first time I watched it.
I'm not going to address it with this particular video, but all my videos suffer from the same general problem: once I get into the meat of a subject they are always really fun, but the early parts tend to be a bit laborious. This is an artifact of my research and reasoning process, but maybe in the future I can punctuate the more-boring parts with weird facts or edits. Pay to Win 2 was pretty good with this because I decided to establish the themes upfront with the Pokemon talk, which gives the viewer something to chew on while I Read Out All The Information.
The Minecraft joke feels a little disconnected here. I think it's a good inclusion despite its length but it doesn't link directly with the main video.
Voice was very hot in places. Mostly the tail end of session 1/early session 2? Lots of highs. I don't know how, since I've been filtering pretty heavily, but I removed the voice comp in Premiere and added a de-esser. I should really test my voice on different headphones before making these changes, but whatever. I will test on different headphones before the final export. At the very least I'll get an idea of the DT 770s' limits.
The aesthetics section was great, perfect atmosphere, speed, performance. The entire end of session 2 was performed really well.
- [x] 10:11 Cut between "once an economy..." and "they're hard to get rid of" is a little awkward.
- [x] 11:03 Shorten the second quote holy shit it's so long
What if cats are hidden? What about these silly cats?
- [x] 12:17 Rewrite the explanation for clarity. It's fine.
- [x] 13:06 It's not clear that this is a Mongol horde
- [x] Do I make the need for human data labelling clear?
- [x] 15:13 Much louder than previous section (check this against compressor changes in the project). This is the part I described as too hot.
- [x] 16:55 reduce the volume on the Facebook thing
- [x] 17:26 mosaic the taskbar
- [x] 19:16 missed a text mosaic
- [x] 31:40 missed a few text mosaics
- [x] 31:58 there's a frame of the interviewer at the beginning
- [x] 35:42 crossfade is awkward, just go to the shot of SD after it generated the guy. Keep the click and zoom timing as is.
- [x] 35:56 some kind of flashing in the sped up video, probably tabbed out
- [ ] Stable Diffusion shots in general: consider scaling up the prompts to help the viewer resolve what's happening.
- [x] 41:00ish all of the text overlays in the DDS part are hard to pay attention to with the talking, maybe slow that section down and trade off between talking and text?
- [x] 42:45 Music is too loud? Sounds like the voice and music are interfering on the master compressor?
- [x] 43:00 Show the orb when you say "scanning people's retinas," so it doesn't come out of nowhere with the WebP joke.
Wow, I finished the edit. A few voiceover lines to re-record, maybe some pacing changes, but the cut is done in under 15 days. Music and video credits are looming; that part always sucks. Really proud of myself. The beginning is still too long, but it's in a state that I'm happy with. Everything kind of clicked again in the last couple of days and I'm enjoying the atmosphere of the video, especially as it goes on. I still have that persistent problem where the beginnings of videos are dense and boring, but it's too late now; hopefully the memes keep most people on board.
Found something interesting I might want to incorporate: Directive Games is a studio that makes crypto games, and they gave a talk at SIGGRAPH, a big computer graphics conference, showing off some AI tools for procedural generation (pretty cool ideas honestly--block locations out in 3D and the AI fills them in, from what I'm getting by skipping through the video). Specifically it was at a side event called "HIVE" put on by SideFX, the makers of Houdini, a 3D suite like Blender. The devs used Houdini to block out the 3D scenes.
ANYWAY, there are some cool visuals and dense slides that I can use. Specifically one lists a bunch of potential uses for AI in games, by no means exhaustive but possibly useful anyway.
- [x] "well the games industry is just getting started" good spot for that Directive Games slide.
- [x] 15:34 good spot for some Directive Games footage too, after the Photoshop window
- [x] First 15 minutes: way overcompressed, consider rerec
- [x] 15:45 VO is loud
- [x] 17:15 bitrot in blockland too loud?
- [x] 17:30 uncensored bookmarks
- [x] 18:28 add background?
- [x] 19:50 rephrase, too harsh. Change to: AI-generated or just generic
- [x] 19:50 ALSO show the harry styles article here
- [x] 21:00 spoiler warning line comes a little too late
- [x] 22:40 game audio maybe a little too loud, already have subtitles you might as well turn it down
- [x] 22:58 de-esser on this, reduce levels 1-3dB
- [x] 23:55 cut out music during line to make it more awkward
- [x] 26:34 Oblivion audio vid is a little harsh? Idk
- [x] 27:50 remove breath here?
- [x] 28:55 plosives? Or is the mic fucked?
- [x] 29:20 license on this photo?
- [x] 30:00 de-ess princess jane
- [x] 32:00 clip doesn't really follow my thought, see if there's a better place for it
- [x] 34:00 really obviously manipulated. RE-REC
- [x] 34:41 Mosaic
- [x] 35:52 This still has that flashing window problem, I guess I tabbed out twice?
- [x] 41:41 Mosaic
- [x] 41:50 tighten up the transition to EB title screen
- [x] 42:54 Mouse click/mic noise, fade VO earlier
- [x] 43:29 Too tight
- [x] 43:40 I think this clip got nudged or something, black frames around here
- [x] 44:15 A clip DEFINITELY got nudged
- [x] 44:31 Fade in sound later/faster, Lex interrupts me here
- [x] 46:27 Go even further beyond!!!
- [x] 47:12 Mosaic
- [x] 47:28 Maybe scroll the prompt in as I read it?
- [x] 48:00 Music ends randomly
- [x] 48:45 Fucked mic making noises
- [x] 50:05 Change to "branding tool"
- [x] 50:14 Switch "solid colors" off or change text properties. It's uggo
- [x] 50:20 Mic pops on the last line. FUUCK
- [x] 51:58 change to stroke text, white shadow is ugly
- [x] 52:20 Mosaic
- [x] 52:20 So many esses; just filter. The de-esser didn't work before.
- [x] 54:49 Mosaic. Also add white stroke for readability.
- [x] 55:21 Mosaic
- [x] 56:20 Music should not cut out here
- [x] 56:50 Mosaic
- [x] 57:27 Mosaic
- [x] 57:50 Maybe zoom on the "choice architecture" thing?
- [x] 58:14 Music is fucked again
- [x] 58:25 This entire section is extremely loud
- [x] 58:40 Remove audio for the spiderman elsa stuff
- [x] 59:10 Again too loud
- [x] 59:43 Black frames here for some reason
- [x] 1:00:15 Black frames again at the end of this clip
- [x] 1:00:56 Mosaic
- [x] 1:01:07 change to "Data collection is a solved problem. Analyzing and using it is not."
- [x] 1:01:34 desynchronized slap? What is going on?
- [x] 1:03:13 Mosaic
- [x] 1:04:23 Extend L&P section and upload a test clip
- [ ] 1:06:00 Maybe add a textual note about the term "GenAI" and Gartner?
- [x] 1:07:46 Re-rec, mic is fucked
- [x] 1:09:53 Add new VO here then edit. This is not a "quiet contemplation" part, cut it normal-tight.
- [x] 1:10:08 Sync the cut with music
- [x] 1:11:15 remove this or the previous quiet part
- [x] 1:12:20 plosive
- [x] 1:12:35 plosive
- [x] 1:14:00ish edit the video more, show stuff to keep the pace of the VO
- [x] 1:14:40 tighten up a lot
- [x] 1:16:54 fade music out slower
- [x] 1:17:30 music is a little loud. Maybe cut just the bass/low-mids? Reconsider.
There's a Fallout TV show coming out?? And they used AI to generate a promo image https://kotaku.com/fallout-tv-amazon-ai-art-bethesda-strike-release-date-1850772308. It worked on me, too. I didn't notice the obvious issues with it.
AI-generated game assets
Switch store:
- Hentai Stars by RedDeer.Games
- Wroom Wroom Puzzle by Prison Games
- Prison Games has plenty of AI generated splash screens (Sherlock Purr and Crazy Trucks are very obvious)
- Rock'N Racing Off Road DX??? By EnjoyUp Games. Splash could be AI-generated
- Hentai Girls by RedDeer.Games
- Hentai World by RedDeer.Games (man, they're horny)
15.ai is an interesting story, especially the MLP stuff. https://en.wikipedia.org/wiki/15.ai
ChatGPT is not even lying.
Is a neural network a self-calibrating statistical "correlator"? A good video I watched called AI a "universal function approximator." This is a fantastic description.
When training with a cost function, the network aims to minimize the cost. The particular topology of the network can make it better for certain things, e.g. image generation or digit identification, but at the end of the day machine "intelligence" is just statistics.
It is what economists imagine people are doing (?)
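To pin down for myself what "minimizing the cost" means mechanically, here's a toy sketch (purely illustrative, plain numpy, not from the video): a single node fitting a line by nudging its weights downhill on a squared-error cost.

```python
# Toy example: one "node" learning its weights by minimizing a cost function.
# Purely illustrative; the data and numbers are made up.
import numpy as np

rng = np.random.default_rng(0)

# Fake training data: inputs x and targets y that follow y = 3x + 1 plus noise.
x = rng.uniform(-1, 1, size=100)
y = 3 * x + 1 + rng.normal(scale=0.1, size=100)

w, b = 0.0, 0.0   # the weights start out meaningless
lr = 0.1          # learning rate

for step in range(500):
    pred = w * x + b                 # the node is just a function of its inputs
    error = pred - y
    cost = np.mean(error ** 2)       # the cost function (mean squared error)
    # Gradient of the cost with respect to each weight.
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    # Nudge the weights downhill; this is all "training" is.
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # ends up near 3 and 1: the statistics of the data, nothing more
```

The node never "understands" the line; it just ends up with weights that reproduce the statistics of the training data.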
- [x] Where do training weights come from?
- [x] What is training?
- [x] Revise intro (see August 8th 2023)
- [x] More games! AI Dungeon
- Other AI-in-games topics, like actual (enemy) AI, whatever Capcom is up to... find more examples.
- [x] Clarify the surveillance talk
- [x] Copyright talk
- [ ] C2PA?
- [x] Obvious AI issues
- [x] Art theft/copyright
- [x] Bias
- [x] Higher level AI issues
- [x] The AI apocalypse
- [x] Data labeling and labour
- [x] Captcha!
- [x] Skill decay
- [x] Code helpers/writing helpers
- We don't know what we're losing (even if the skill decay argument is not good, we don't know what AI will do to human knowledge)
- [x] The nature of truth
- [ ] Adobe's new image expansion/smart fill stuff, Google's Magic Editor, voice cloning, deepfakes... "Partially generative" tools
- [x] Marketing
- [ ] Elimination of time and space
- Ties in with the nature of truth: what is a photograph edited with AI? When?
- [x] Loneliness
- [x] Manipulation--AI driven adtech, AI driven microtransactions
- [x] Shorten/integrate loneliness and manipulation stuff.
C2PA: a committee headed by Adobe that seemingly wants to put digital signatures on all files. These would then be automatically amended any time the file is changed. Likely a way to keep training data semi-pure. Too niche and weird and disconnected to cover in the video, but for future research: there must be some trusted administrator of provenance, right? Is a secondary goal of C2PA driving Adobe subscriptions?
More about Adobe. Is the movement toward "the cloud" why they've cared so little about piracy? Is their final goal to have all of your effects (or image adjustments) computed in "the cloud"? I would imagine the AI stuff requires an internet connection--at least the intensive things like Firefly. Adobe has been striving toward this goal of fully-automating art production for a long time (Craig Mullins interview), and AI is a useful way to secure their product: it has to run on their machines, so you can't just crack the software.
There is definitely a video worth of Adobe talk alone.
What AI offers consultants is a new technology of human experience; they sell new ways to read and manipulate people ([memorable.io]).
Gartner is on the S&P 500, apparently works with many large corporations (consultation, I guess?), and is hyping up AI.
ImageNet is a historically very important dataset for object recognition in images. Its creators used Amazon's Mechanical Turk to label it.
Content-culture primes us for AI; all art is treated as equally valid on a flat playing field. We accept slop; there's no aesthetic sense. The main thing a consumer can do to "combat" AI art is to stop engaging with bad art altogether; the vast majority of AI art should go with it.
If some dogshit live-action Disney/Nickelodeon show is "worthy" of a six-hour-long analysis, then there's really no bulwark against AI art. We may be able to spot it, but critics have failed to equip people to deal with AI art; no one has any reference for what is worthy of attention, because attention is marketized and detached from any concept of quality. This is also ground that cannot be recovered, like trying to reassemble a million shards of porcelain.
e.g. Ymfah's AI-generated videos become a threat to "art" because we have no idea what art is.
The critique of AI art stops at ethics because there is no artistic standard to defer to, but morality is blurry, subjective, easy to deny. Morality is useless in a world without a common sense of truth; it is totally hollow if society's real language is finance. The phenomenology of AI is harder to refute: doing so requires either a thorough treatment (digging into the truth to find counter-arguments and counter-evidence) or a denial of reality.
Good art is defined by meta qualities that somehow connect painting, writing, music. Internal consistency. The skill to find all possible creative decisions and make them with one hermeneutic (this is what we call the author's voice). Great art embodies its concept. This is all very vague.
I. What is AI? (Computer Magic)
- AI stuff requires a lot more manual labour than you'd think
- Material concerns, structure of a neural net (toy sketch below)
- Nodes = functions
- Activation functions
- Weighting
- Where does training data come from?
- The third world
- Mechanical Turk
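A toy version of the structure bullets above, for my own reference (made-up weights, plain numpy, nothing from the script): each node is a function, a weighted sum pushed through an activation.

```python
# Minimal sketch of "nodes = functions, activation functions, weighting".
# The weights here are random placeholders; real ones come from training.
import numpy as np

def relu(x):
    return np.maximum(0, x)  # a common activation function

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # layer 1: 3 inputs -> 4 nodes
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # layer 2: 4 nodes -> 1 output

def network(x):
    hidden = relu(W1 @ x + b1)  # each node: weight the inputs, add bias, activate
    return W2 @ hidden + b2     # output node: one more weighted sum

print(network(np.array([0.2, -1.0, 0.5])))
```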
II. AI Art (Cute Dog Pixel Art Van Gogh Featured on Artstation). This should be the bulk of the video.
- All this tech is fine and good, but what does it do?
- Looking at AI art
- Decoherence in Facebook's song thing
- Can AI be creative?
- The use of AI in games
- The use of AI in content
- Google will get so much worse
- Arms race between AI-assisted algorithm exploitation and search filtering
- AI-generated work will probably be acceptable
- Everything will get worse but we'll live with it probably.
- We don't know what we're losing (skill decay)
IV. Theories of Intelligence (Sci-fi is My Life)
- The terminology around machine learning
- Rationalism/effective altruism
- Computers aren't conscious; the idea that these networks pass some threshold and become conscious is magical thinking.