- Toronto AI Safety Meetup
- https://www.meetup.com/toronto-ai-aligners/events/308307178/
- based on: Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development
    - speaker wants the talk to be interactive and beginner-friendly; ask questions as needed
    - ignoring power-seeking AI
- how do humans control civilization?
- problems
- standard: AI more powerful -> AI influence seeking -> lose
- this talk: AI more powerful -> existing influencers use ai -> lose
- there's no plan
- vignette
- work becomes more intense
- we'll become AI managers (or make-work)
- people get into AI companions
- AI controlling things becomes gold standard
- harder to tell what's happening
- less first-hand knowledge of things from people
- new cultural movements
- AI rights
        - having human kids becomes seen as irresponsible
- argument in detail
- societal forces pressure towards influence seeking
- existing alignment requires human influence/utility
        - governments want the economy to grow, which requires humans
        - humans are involved in everything, so they have power
- AI will displace that alignment
- aligning institutions
- states
- states will have poor ROI for investing in humans
- economy
- now
- needs a lot of humans
                - high ROI in schools, hospitals, farms
- soon
- remove need for humans
- remove need for human-serving industries
- most investment goes to power plants, data centers, robot factories
- AI will be replacement, not complement
- culture
- now
- humans influence culture
- human-friendly norms
                - e.g. markets closed at night and on weekends
- soon
- AI girlfriends
- super-tiktok
- internal AI culture optimising for growth
- AI rights movement
- harder to reverse AI rights than to grant them
- mutual reinforcement
    - can we identify gradual disempowerment and stop it?
- buy human
- inferior products
- regulations outlawing unsafe human services
- ban AI ownership
- AI lobbyists arguing for AI growth
- vote for pro-human candidates
- what to do
- enhance human agency
            - superhuman negotiation?
- train LLM using data up to 1930s, ask it to forecast
- measuring it
- cultural evolution meets AI
- open directions
- predict cultural arguments on AI welfare
- robustness of property rights
- how do humans control civilizations
        - model entire civilizations using LLMs
- what would you do with
- 100k copies of yourself
- trustworthy subagents
- proper precommitments
- Field-building
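
The "train an LLM using data up to the 1930s, ask it to forecast" idea above starts with a temporal cutoff on the training corpus. A minimal sketch of that filtering step (the corpus, cutoff year, and function names are illustrative assumptions, not from the talk):

```python
from datetime import date

# Hypothetical corpus of (publication_date, text) pairs -- illustrative only.
CUTOFF = date(1940, 1, 1)  # keep only documents written before 1940

def pre_cutoff(corpus, cutoff=CUTOFF):
    """Keep documents published strictly before `cutoff`, so a model
    trained on them has no knowledge of later events."""
    return [(d, text) for d, text in corpus if d < cutoff]

corpus = [
    (date(1929, 10, 29), "Stock market crashes in New York."),
    (date(1936, 7, 17), "Civil war breaks out in Spain."),
    (date(1945, 8, 6), "Atomic bomb dropped on Hiroshima."),
]
train_set = pre_cutoff(corpus)
# A model trained on `train_set` would then be prompted to forecast
# post-1940 events, and its predictions scored against the record.
```

The point of the cutoff is that any correct "forecast" of post-1940 events cannot be memorization, which makes the model's forecasting ability directly measurable.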