An AI Safe Harbor Provision Would Create Guidelines For Development & Safety Without Premature Regulations
The conversation around Artificial Intelligence has started to take on a binary quality, rather prematurely, as if we were debating the two sides of a coin rather than a more complex shape. “Let builders build as is” vs “Regulate.” Ironically, both positions are outputs of acknowledging the incredible early power and promise of the tipping point we’ve reached, but neither incorporates the ambiguity. Fortunately there’s some legal precedent here which might help, and we only have to go back to the earlier days of the Internet and the concept of safe harbor.
‘Safe harbor’ is a regulatory framework which provides that certain conduct won’t break a rule so long as specific conditions are met. It’s used to provide clarity in an otherwise complex situation, or to provide the benefit of the doubt to a party so long as they abide by generally accepted, reasonable standards. Perhaps the most well-known example in our industry is the 1998 Digital Millennium Copyright Act (DMCA), which provided safe harbor to Internet businesses around copyright infringement performed by their end users so long as several preconditions were met (no direct financial benefit from the infringing activity, no actual knowledge of the infringing materials, prompt removal upon notice, and so on).
The DMCA allowed for billions of people globally to express themselves online, prompted new business model experiments, and created guardrails for any entrepreneur to stay legal. It’s not perfect, and it can be abused, but it met the reality of the moment in a meaningful way. And it made my career possible, working with user generated content (UGC) at Second Life, AdSense, and YouTube. During my time at the world’s largest video site, I coined the ongoing public metric “# hours of video uploaded every minute” to help put YouTube’s growth in perspective and frame for regulators how unfathomable and unreliable it would be to ask human beings to screen 100% of content manually.
Now, 25 years later, we have a new tidal wave, but it’s not UGC; it’s AI and, uh, User Generated Computer Content (UGCC), or something like that. And from my point of view it’s a potential shift in capabilities as significant as anything I’ve experienced so far in my life. It’s the evolution of what I hoped for: not software eating the world, but software enabling it. And it’s moving very, very quickly. So much so that it’s perfectly reasonable to suggest the industry slow itself down, specifically by pausing the training of new models while we all digest the impact of the change. But that’s not what I’d advocate. Instead, let’s speed up creating a temporary safe harbor for AI, so our best engineers and companies can continue their innovation while being incentivized to support guardrails and openness.
What would an AI Safe Harbor look like? Start with something like, “For the next 12 months any developer of AI models would be protected from legal liability so long as they abide by certain evolving standards.” For example, model owners must:
- Transparency: For a given publicly available URL or submitted piece of media, let anyone query whether its top-level domain is included in the model’s training set. Simple visibility is the first step; the broader ‘do not train on my data’ question (aka robots.txt for AI) is going to take more thinking and more tradeoffs from a regulatory perspective.
- Prompt Logs for Research: Provide a statistically significant sample of prompt/input logs (no information about the originator of the prompt, just the prompt itself) on a regular basis for researchers to understand and analyze. So long as you’re not knowingly, willfully, and exclusively targeting and exploiting particular copyrighted sources, you will have infringement safe harbor.
- Responsibility: Documented Trust and Safety protocols that allow for escalation around violations of your Terms of Service, along with some sort of aggregate transparency statistics on these issues.
- Observability: Auditable, but not public, frameworks for measuring ‘quality’ of results.
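To make the transparency requirement above concrete, here is a minimal sketch of what the domain-lookup query might look like. Everything here is hypothetical: `TRAINING_SET_DOMAINS` stands in for an index a model owner would publish or serve, and a real implementation would use the Public Suffix List to reduce hosts to registered domains correctly rather than the naive `www.` stripping shown.

```python
from urllib.parse import urlparse

# Hypothetical index of domains present in a model's training set.
# In practice this would be a service backed by the provider's crawl manifest.
TRAINING_SET_DOMAINS = {"example.com", "wikipedia.org"}

def registered_domain(url: str) -> str:
    """Reduce a URL to its bare host, dropping any 'www.' prefix.
    (A production version would consult the Public Suffix List.)"""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host

def in_training_set(url: str) -> bool:
    """Answer the transparency query: was this URL's domain in the training data?"""
    return registered_domain(url) in TRAINING_SET_DOMAINS

print(in_training_set("https://www.example.com/article"))  # True
print(in_training_set("https://somewhere.example.net/post"))  # False
```

The point is not the lookup itself but the interface: a yes/no answer per domain is cheap for model owners to provide and gives publishers the visibility that any future ‘do not train’ mechanism would build on.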
In order to prevent a compliance burden that only the largest, best-funded companies could bear, AI Safe Harbor would also exempt all startups and researchers who have not yet released public base models and/or who serve fewer than, for example, 100,000 queries/prompts per day. Those folks are just plain ‘safe’ so long as they are acting in good faith.
Within the 12-month period, AI Safe Harbor would either be extended as is, modified and renewed, or eliminated in favor of general regulations. But the goal is to remove ambiguity and start directing companies toward common standards (and the common good), while maintaining their competitive advantages locally and globally (China!).