Understanding Legitimate Access vs Unauthorized Access

Before writing a single line of code, establish clear ethical and legal boundaries. This tutorial covers authentication automation for resources you already have legitimate access to. It does not cover bypassing security, accessing restricted content, or hacking of any kind.

Legitimate Use Cases: This tutorial applies when you have a university library account and want to automate login, your institution provides proxy access to academic databases, you have personal subscriptions to research platforms, your employer grants you access to internal knowledge bases, or you need to authenticate with multiple services using the same SSO credentials.

What This Tutorial Does NOT Cover: Bypassing paywalls for content you don't have rights to, circumventing institutional access controls, sharing authentication tokens with unauthorized users, accessing resources outside your license agreements, or violating terms of service or acceptable use policies.

The Legal Landscape

Authentication automation exists in a nuanced legal space. Consider these key principles:

Computer Fraud and Abuse Act (CFAA) in the US

The CFAA prohibits accessing computers "without authorization" or "exceeding authorized access." When you automate login to systems you have legitimate credentials for, you're operating within your authorization. Terms of service violations can complicate this, although recent case law suggests that ToS violations alone don't constitute CFAA violations: Van Buren v. United States (2021) narrowed the meaning of "exceeding authorized access," and hiQ Labs v. LinkedIn (9th Cir. 2019, reaffirmed 2022) reached a similar conclusion for the scraping of publicly accessible data.

International Considerations

EU: GDPR requires a lawful basis for processing personal data, which can include stored credentials and session identifiers

UK: Computer Misuse Act 1990 focuses on unauthorized access

Canada: Criminal Code Section 342.1 addresses unauthorized computer use

Key Principle: If you can legally log in manually, automating that login is generally permissible. The moment you access something you couldn't access manually, you've crossed the line.

Technical vs Ethical Access Rights

Your institutional license agreements grant you specific rights:

Typical Academic License Rights

Academic licenses typically grant personal use for research, teaching, and education; the right to download and save articles for personal research; the ability to use content in scholarly publications with attribution; and access through institutional proxy or VPN.

What Licenses Usually Prohibit

Most licenses prohibit systematic downloading such as entire journal issues or bulk scraping, redistribution to non-authorized users, commercial use of content, and automated access that disrupts service through high-volume requests.

Session State Preservation

The core technique in this tutorial is session state preservation. When you log in to a website, the server issues cookies that prove your authenticated status. By saving these cookies and replaying them in future sessions, you avoid repeated manual authentication.

Why Cookie Preservation Is Ethical: You performed the initial authentication legitimately, the session token represents your authorized access, you're not bypassing any security measures, and the server granted you this token explicitly.

Technical Implementation

The implementation involves saving cookies after manual authentication, storing them securely with encryption at rest, restoring them in automation sessions, and respecting expiration and refresh cycles.
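
The save, restore, and expiration steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tutorial's final implementation: the cookie format (a list of dicts with an optional "expires" Unix timestamp) and the function names save_cookies and load_valid_cookies are hypothetical, and encryption at rest is deliberately left out because it needs a third-party library.

```python
from __future__ import annotations

import json
import os
import time
from pathlib import Path


def save_cookies(path: Path, cookies: list[dict]) -> None:
    """Write cookies as JSON, requesting owner-only permissions (0o600)."""
    # The mode passed to os.open applies when the file is created, so a new
    # state file is never briefly world-readable. An existing file keeps its
    # current mode. NOTE: this sketch does not encrypt the contents.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(cookies, fh)


def load_valid_cookies(path: Path, now: float | None = None) -> list[dict]:
    """Return only the cookies that have not yet expired.

    Assumption: a cookie without an "expires" field is a session cookie
    and is kept.
    """
    now = time.time() if now is None else now
    with open(path, encoding="utf-8") as fh:
        cookies = json.load(fh)
    return [c for c in cookies if c.get("expires") is None or c["expires"] > now]
```

If load_valid_cookies returns an empty list, the stored session has lapsed and a fresh manual login is needed before automation can continue.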

SSO/SAML Authentication Patterns

Single Sign-On (SSO) systems, particularly SAML-based institutional logins, are common in academic environments. These involve:

Service Provider (SP): The resource you want to access (e.g., Elsevier)
Identity Provider (IdP): Your institution's authentication system
SAML Assertion: XML token proving your identity

Ethical Automation Approach

The ethical approach involves automating the IdP login using your university credentials, following redirects to complete the SAML flow, preserving resulting session cookies, and never intercepting or forging SAML tokens.

What This Tutorial Covers

This tutorial builds a complete system that authenticates using your credentials stored securely, handles common authentication patterns including forms, SSO, and OAuth, preserves session state such as cookies and local storage, manages MFA challenges with user intervention, refreshes expired sessions automatically, and integrates with Claude Code through MCP.

Security Boundaries

The system maintains strict security boundaries: credentials never leave your local machine, state files are encrypted at rest, there is no credential logging or telemetry, you keep control over when authentication is triggered, and an audit trail records every authentication operation.
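
One way to reconcile "no credential logging" with "an audit trail" is to redact sensitive fields before anything is written. The sketch below assumes a JSON-lines audit file. The key list SENSITIVE_KEYS and the function names are illustrative choices, not part of any particular library.

```python
from __future__ import annotations

import json
import time

# Hypothetical list of field names whose values must never reach the log.
SENSITIVE_KEYS = {"password", "token", "cookie", "secret"}


def audit_record(event: str, details: dict, now: float | None = None) -> str:
    """Build one JSON audit line, replacing sensitive values with a marker."""
    safe = {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in details.items()
    }
    record = {
        "ts": time.time() if now is None else now,
        "event": event,
        "details": safe,
    }
    return json.dumps(record, sort_keys=True)


def append_audit(path: str, line: str) -> None:
    """Append a single audit line to a local log file."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
```

Redaction here matches on top-level key names only. Nested structures or secrets embedded in free-text values would need additional handling.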

Risk Mitigation

Even ethical automation carries risks that must be addressed:

Account Lockouts: Failed authentication attempts can trigger security measures. Solution: Implement exponential backoff and respect rate limits.

ToS Violations: Some services explicitly prohibit automation. Solution: Review ToS before automating; if prohibited, use manual authentication.

Session Hijacking: Stored cookies are sensitive. Solution: Encrypt state files, use secure file permissions, never commit to version control.

Credential Exposure: Configuration files may contain secrets. Solution: Use environment variables, credential managers, and .gitignore properly.
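
Two of the mitigations above, exponential backoff and environment-variable credentials, lend themselves to a short sketch. The variable naming scheme (PREFIX_USERNAME, PREFIX_PASSWORD), the try_login callback, and the default attempt count are assumptions made for illustration. Keep the attempt limit low, since repeated failures are what trigger lockouts in the first place.

```python
from __future__ import annotations

import os
import random
import time
from typing import Callable


def credentials_from_env(prefix: str = "LIBRARY") -> tuple[str, str]:
    """Read credentials from environment variables instead of config files."""
    user = os.environ.get(f"{prefix}_USERNAME")
    password = os.environ.get(f"{prefix}_PASSWORD")
    if not user or not password:
        raise RuntimeError(f"Set {prefix}_USERNAME and {prefix}_PASSWORD first")
    return user, password


def backoff_delays(retries: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Delay before each retry: base * 2**n, never more than cap."""
    return [min(cap, base * 2 ** n) for n in range(retries)]


def login_with_backoff(
    try_login: Callable[[], bool],
    max_attempts: int = 4,
    base: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call try_login up to max_attempts times, waiting longer after each failure."""
    delays = backoff_delays(max_attempts - 1, base)
    for attempt in range(max_attempts):
        if try_login():
            return True
        if attempt < max_attempts - 1:
            # Up to 10% random jitter so retries do not land in lockstep.
            delay = delays[attempt]
            sleep(delay + random.uniform(0, 0.1 * delay))
    return False
```

The sleep parameter exists so the function can be exercised without real waiting. In normal use the default time.sleep applies.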

Responsible Automation Principles

As you build authentication automation, follow these principles: Respect rate limits by avoiding rapid bursts of requests. Honor robots.txt, including in authenticated areas, which may carry their own crawler restrictions. Monitor your impact by watching for increased latency or errors. Maintain audit logs to track what you automated and when. Secure your systems by treating stored credentials like production secrets. Stay within license terms, since automation doesn't change your usage rights. Be transparent by documenting your automation for compliance reviews.
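
As an illustration of the rate-limit principle, the sketch below enforces a minimum interval between requests. The class name MinIntervalLimiter and the choice of interval are assumptions. The appropriate pacing depends on each service's published limits and terms.

```python
from __future__ import annotations

import time
from typing import Callable


class MinIntervalLimiter:
    """Ensure at least min_interval seconds pass between consecutive calls."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: float | None = None

    def wait(self) -> float:
        """Block until the interval has elapsed; return the seconds waited."""
        now = self._clock()
        waited = 0.0
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
                now = self._clock()
        self._last = now
        return waited
```

Calling wait() before every request keeps automated traffic at or below one request per min_interval seconds, and the returned value can be written to the audit log to show how often throttling occurred.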

Ethical Foundations