- Toronto AI Safety Meetup
- News
- Confusing Terminology
- Types of confusing terminology
- Jargon, Ambiguity, Non-Concepts, Connotations/Baggage, False Friends, Drift/Appropriation
- Confusing terms
- Responsible Scaling Policy (from audience)
- Agent
- Types
- Anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators
- Rational agent that maximizes expected value (EV); see the code sketch below
- Software that uses AI to perform tasks or make decisions autonomously on behalf of a user/organisation
- Recent usage, since GenAI took off
- Each of these types is a subset of the previous one
- Maybe GenAI agents aren't rational?
- Electrons are agents
- Humans have more agency than electrons
- Not useful though
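- A minimal sketch of the second definition (the textbook rational agent): choose the action with the highest expected value over outcomes. The actions and probabilities are made up for illustration, not from the talk.

```python
# Hypothetical outcome model: action -> list of (probability, value) pairs.
# The action names and numbers are illustrative only.
OUTCOMES = {
    "explore": [(0.5, 10.0), (0.5, -2.0)],
    "exploit": [(0.9, 3.0), (0.1, 0.0)],
}

def expected_value(outcomes):
    """Expected value of one action: sum of probability * value."""
    return sum(p * v for p, v in outcomes)

def rational_agent(outcome_model):
    """Textbook 'rational agent': pick the action that maximizes expected value."""
    return max(outcome_model, key=lambda a: expected_value(outcome_model[a]))

if __name__ == "__main__":
    print(rational_agent(OUTCOMES))  # -> "explore" (EV 4.0 vs 2.7)
```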
- Feature
- In ML: number in input vector
- In computer vision: something interesting about an image
- Usually first step in processing
- In interpretability: a vector in one of the hidden layers that corresponds to something interesting or meaningful (see sketch below)
- E.g. "golden gate" in Golden Gate Claude
- Artificial Intelligence
- Original term was very broad
- GOFAI (Good Old Fashioned AI)
- Machine Learning
- Alignment
- Meanings
- Originally: advanced AI doing specific tasks without catastrophic side effects
- Now: making AI be nice
- Other meaning: doing what the user wants, not what they ask
- Inner vs outer alignment
- Originally "friendly AI"
- Anthropomorphizing stuff
- Goals
- Anthropic's golden gate feature only activates on 2% of mentions of golden gate
- Maybe they're just extracting the feature wrong?
- Reward
- Understanding
- The book Gödel, Escher, Bach discusses the term "understanding"
- Deception
- Generally treating LLMs like humans
- FLOP/s (floating point operations per second) vs FLOPs (floating point operations); worked example below
- Supposed to use slash
- Doesn't work when talking though
- Not always used
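- The distinction matters for back-of-the-envelope compute math: total FLOPs (a quantity of work) divided by hardware FLOP/s (a rate) gives wall-clock time. The numbers below are purely illustrative:

```python
# Illustrative numbers only: a training run needing 1e21 FLOPs (total work)
# on hardware sustaining 1e15 FLOP/s (a rate) takes work / rate seconds.
total_flops = 1e21          # FLOPs: a count of floating point operations
hardware_flop_per_s = 1e15  # FLOP/s: floating point operations per second

seconds = total_flops / hardware_flop_per_s
days = seconds / 86_400
print(f"{seconds:.3g} s ~= {days:.3g} days")  # 1e+06 s ~= 11.6 days
```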
- Arms race
- Actual weapons or technology race
- Intelligence
- AI risk
- Pivotal act
- An act that changes the trajectory of AI progress a billion years in the future
- Something that simplifies future decision tree/cone
- ASI could go really well or really badly; we don't know
- Destroying the world is a bad pivotal act, would near-guarantee doom
- Harder to near-guarantee not-doom
- E.g. destroying all GPUs
- Related: Long Reflection
- Governance