Building a Production-Grade Authentication Service for SaaS
From EdTech MVP to Multi-Tenant Platform

Repository: https://github.com/AkshayThoolkar/ai_edtech_auth_service.git

1. Introduction: The Real Problem

In the software world, the standard advice is almost always: “Don’t roll your own authentication.”

That advice is usually right. Authentication is security-critical infrastructure where mistakes are expensive. Mature providers like Auth0, Cognito, and Descope exist precisely because identity is hard to get right.

And yet, I built my own.

I didn't do it because I wanted to reinvent cryptography or because commercial tools are bad. I did it because, in my specific business context, owning authentication became strategically necessary. For a high-growth B2C platform, "just buying auth" wasn't a silver bullet—it was a tax on my unit economics.

This article details the business logic behind that decision, the architecture of the production-grade system I built to replace the vendors, and how I designed it to evolve into a multi-tenant identity provider for my future ventures.

2. Context: The EdTech Platform

To understand the decision, you have to understand the product. I was building an AI-powered EdTech platform targeting students and learners.

The constraints were clear:

• User Base: Highly price-sensitive B2C users.

• Acquisition Model: Heavy reliance on free trials and freemium tiers to drive growth.

• Scale: Aggressive goals to reach millions of users.

In a microservices architecture, authentication isn't just a login screen. It is the identity source of truth, the root of authorization decisions, and a dependency for every single API call.

3. Decision: Build vs. Buy

I didn’t start by building this. Initially, I integrated Descope. It was genuinely excellent: fast setup, customizable UI, and a smooth developer experience. For an MVP, it reduced time-to-market dramatically.

But as I modeled the long-term business logic, the math broke down. At the time, Descope’s basic plan was roughly $249 per month for 10,000 Monthly Active Users (MAU). That’s about $0.025, or approximately ₹2.2, per user per month.

That sounds reasonable until you project growth:

• 10,000 users → ~$249/month

• 100,000 users → ~$2,500/month

• 1 million users → ~$25,000/month
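The figures above are a straight-line extrapolation of the quoted plan price. A minimal sketch of that arithmetic, assuming the price scales linearly per MAU (real vendor pricing is usually tiered, so treat this as a rough estimate; the function name is illustrative):

```python
# Assumption: cost scales linearly from the quoted $249 per 10,000 MAU.
PLAN_PRICE_USD = 249
PLAN_MAU = 10_000


def monthly_auth_cost(mau: int) -> float:
    """Projected monthly cost in USD under a linear per-MAU price."""
    return PLAN_PRICE_USD * mau / PLAN_MAU


for mau in (10_000, 100_000, 1_000_000):
    print(f"{mau:>9,} MAU -> ${monthly_auth_cost(mau):,.0f}/month")
```

The exact results ($249, $2,490, $24,900) are what the rounded figures in the list approximate.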

For B2B Enterprise SaaS with high revenue per user, these costs are negligible. But for a B2C product where many users are not immediately monetized, authentication alone could become one of the largest operating expenses. If authentication sat outside my system, I inherited its pricing and constraints. If I brought it inside, I inherited the responsibility—but also the leverage.

The Trade-off Matrix

I chose the custom path, with one golden rule: Don't invent crypto. I would use only standard, battle-tested libraries for every critical function.

4. System Overview (How it Works)

The service serves as the single source of truth for identity. Here is the lifecycle of a user in simple terms:

1. Registration: A user signs up. They are created in an "unverified" state.

2. Verification: The system generates a 6-digit One-Time Password (OTP), hashes it, stores it, and emails it to the user.

3. Activation: Once the user submits the correct code, the account is marked active.

4. Token Issuance: The system issues two tokens:

◦ An Access Token (short-lived) for making API calls.

◦ A Refresh Token (long-lived) for keeping the user logged in without re-entering credentials.

We also support Passwordless Login (logging in via OTP) and Google OAuth 2.0 for users who prefer social login.

5. Architecture & Design

The codebase is designed to be "boring in a good way"—predictable, observable, and layered.

• Framework: FastAPI (Python). Chosen for its async performance and automatic Swagger documentation.

• Database: PostgreSQL. The industry standard for relational data integrity.

• ORM: SQLAlchemy with Pydantic schemas for data validation.

• Migrations: Alembic for managing database schema changes.

The application is structured to strictly separate concerns: API Routers → Pydantic Schemas → Service Logic → CRUD Operations → Database Models.

This ensures that business logic (like "how long is an OTP valid?") is decoupled from the raw database queries.
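To make the layering concrete, here is a minimal sketch of the idea using only the standard library. The class names, the in-memory dictionary standing in for the database, and the 10-minute lifetime are all hypothetical; the point is only that the expiry rule lives in the service layer while the CRUD layer just stores and fetches.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical value; the article does not state the real OTP lifetime.
OTP_TTL = timedelta(minutes=10)


@dataclass
class OtpRecord:
    """Model layer: the shape of a stored row."""
    email: str
    otp_hash: str
    created_at: datetime


class OtpCrud:
    """CRUD layer: storage only, no business rules (a dict stands in for the DB)."""

    def __init__(self) -> None:
        self._rows: dict = {}

    def save(self, record: OtpRecord) -> None:
        self._rows[record.email] = record

    def get(self, email: str) -> Optional[OtpRecord]:
        return self._rows.get(email)


class OtpService:
    """Service layer: owns the 'how long is an OTP valid?' rule."""

    def __init__(self, crud: OtpCrud) -> None:
        self._crud = crud

    def is_otp_live(self, email: str, now: datetime) -> bool:
        record = self._crud.get(email)
        return record is not None and now - record.created_at <= OTP_TTL


crud = OtpCrud()
service = OtpService(crud)
issued_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
crud.save(OtpRecord("learner@example.com", "<hash>", issued_at))
print(service.is_otp_live("learner@example.com", issued_at + timedelta(minutes=5)))
```

Changing the validity window then touches only `OTP_TTL` and the service, never the storage code.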

6. Security Design Decisions

This is where the "Buy vs. Build" decision carries the most risk. To mitigate this, I implemented several layers of security hardening.

1. Hashing Strategy

We use bcrypt (via passlib) for password storage. However, we went a step further: OTPs are also hashed before storage. If the database were leaked, an attacker would see hashed strings, not the active login codes that are currently sitting in users' inboxes.
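A minimal sketch of the "hash the OTP before storing it" idea follows. To stay dependency-free it swaps in the standard library's PBKDF2 rather than bcrypt via passlib, so it illustrates the pattern, not the repository's actual code; the function names and the work factor are assumptions.

```python
import hashlib
import hmac
import secrets

PBKDF2_ROUNDS = 100_000  # illustrative work factor, not taken from the repository


def generate_otp() -> str:
    """Return a 6-digit code from a cryptographically secure source."""
    return f"{secrets.randbelow(10**6):06d}"


def hash_otp(otp: str, salt: bytes) -> bytes:
    """Derive a salted hash; only this value and the salt would be stored."""
    return hashlib.pbkdf2_hmac("sha256", otp.encode(), salt, PBKDF2_ROUNDS)


def verify_otp(candidate: str, salt: bytes, stored_hash: bytes) -> bool:
    # compare_digest keeps the comparison constant-time
    return hmac.compare_digest(hash_otp(candidate, salt), stored_hash)


salt = secrets.token_bytes(16)
otp = generate_otp()            # this plaintext goes out by email only
stored = hash_otp(otp, salt)    # this is what the database would hold
print(verify_otp(otp, salt, stored))
```

Because a 6-digit code has only a million possible values, hashing is a defense-in-depth measure; short expiry and rate limiting still carry most of the weight.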

2. The "JWT Logout" Problem

JWTs are stateless, which makes "logging out" difficult (you can't just delete a session file on the server). I solved this by implementing a denylist strategy.

• Every refresh token has a unique ID (jti).

• When a user logs out, that jti is added to an invalidated_tokens table.

• The middleware checks this table before refreshing any session. This provides the scalability of JWTs with the control of server-side sessions.
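The denylist flow can be sketched as follows. This is a simplified illustration: an in-memory SQLite database stands in for PostgreSQL, plain dictionaries stand in for signed JWTs, and the single-column table layout and function names are assumptions rather than the repository's schema.

```python
import sqlite3
import uuid

# In-memory SQLite stands in for the real PostgreSQL table.
denylist_db = sqlite3.connect(":memory:")
denylist_db.execute("CREATE TABLE invalidated_tokens (jti TEXT PRIMARY KEY)")


def new_refresh_claims(user_id: str) -> dict:
    """Claims for a refresh token; a real token would be signed by a JWT library."""
    return {"sub": user_id, "jti": str(uuid.uuid4()), "type": "refresh"}


def logout(claims: dict) -> None:
    # INSERT OR IGNORE makes a repeated logout harmless
    denylist_db.execute(
        "INSERT OR IGNORE INTO invalidated_tokens (jti) VALUES (?)",
        (claims["jti"],),
    )
    denylist_db.commit()


def can_refresh(claims: dict) -> bool:
    """A refresh is allowed only if the token's jti is not on the denylist."""
    row = denylist_db.execute(
        "SELECT 1 FROM invalidated_tokens WHERE jti = ?", (claims["jti"],)
    ).fetchone()
    return row is None


claims = new_refresh_claims("user-1")
print(can_refresh(claims))  # before logout
logout(claims)
print(can_refresh(claims))  # after logout
```

In a real deployment the denylist rows would also need an expiry so the table can be pruned once the underlying tokens have lapsed.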

3. Rate Limiting

We implemented two distinct layers of protection:

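• Global Limiter: Restricts requests per IP (e.g., 100 req/min) to blunt abusive traffic from any single source. It does not stop a distributed attack on its own, but it caps what one client can do.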

• Targeted Limiter: Restricts specific actions per email address (e.g., 5 login attempts per minute). This prevents brute-force attacks on specific user accounts.
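One way to picture the two layers is a small in-memory sliding-window limiter keyed first by IP and then by email. This is a sketch under assumptions: the article does not say which algorithm the service uses, and the class and function names here are hypothetical. The limits mirror the examples above.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most max_hits per key within the last window_seconds."""

    def __init__(self, max_hits: int, window_seconds: float) -> None:
        self.max_hits = max_hits
        self.window = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of accepted hits

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_hits:
            return False
        hits.append(now)
        return True


global_limiter = SlidingWindowLimiter(max_hits=100, window_seconds=60)  # per IP
login_limiter = SlidingWindowLimiter(max_hits=5, window_seconds=60)     # per email


def allow_login(ip: str, email: str, now: Optional[float] = None) -> bool:
    # Both layers must pass; lowercasing puts case variants in one bucket.
    return global_limiter.allow(ip, now) and login_limiter.allow(email.lower(), now)
```

Keying the second layer by email rather than IP is what stops an attacker from rotating addresses to keep guessing one account's password.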

7. Implementation Highlights

The system exposes REST endpoints covering the full identity lifecycle:

• Auth Flow: /register, /login, /refresh-token, /logout

• OTP Flow: /request-otp, /verify-otp (handles verification and passwordless login)

• OAuth: /google/login, /google/callback (OpenID Connect standard)

• Ops: /health, /me (introspection).

Every request passes through a middleware stack that enforces security headers (CSP, X-Frame-Options) and logs requests with warnings for slow responses.
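A middleware of this kind can be sketched as a plain ASGI wrapper with no framework imports. The header values, the one-second threshold, and all names below are assumptions for illustration, not the repository's configuration.

```python
import asyncio
import logging
import time

logger = logging.getLogger("auth.middleware")
SLOW_REQUEST_SECONDS = 1.0  # hypothetical threshold

# Illustrative policy values; a real deployment would tune these.
SECURITY_HEADERS = [
    (b"content-security-policy", b"default-src 'self'"),
    (b"x-frame-options", b"DENY"),
]


class SecurityHeadersMiddleware:
    """Pure ASGI middleware: adds headers and warns about slow requests."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        started = time.perf_counter()

        async def send_with_headers(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", [])) + SECURITY_HEADERS
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_headers)
        elapsed = time.perf_counter() - started
        if elapsed > SLOW_REQUEST_SECONDS:
            logger.warning("slow request: %s took %.2fs", scope.get("path"), elapsed)


async def demo():
    """Run the middleware around a tiny stand-in app and capture what is sent."""
    sent = []

    async def app(scope, receive, send):
        await send({"type": "http.response.start", "status": 200, "headers": []})
        await send({"type": "http.response.body", "body": b"ok"})

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await SecurityHeadersMiddleware(app)({"type": "http", "path": "/health"}, receive, send)
    return sent


messages = asyncio.run(demo())
print(messages[0]["headers"])
```

Because FastAPI is an ASGI application, a class shaped like this can be registered with `app.add_middleware(SecurityHeadersMiddleware)`.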

8. Scaling & Performance

The service keeps almost no local state (Postgres holds the data, and only the rate-limit counters live in memory), so it can be scaled horizontally behind a load balancer.

• Async/Await: FastAPI allows handling thousands of concurrent connections efficiently.

• Indexing: Database fields queried frequently (like email, google_id, and refresh_token_jti) are indexed, so lookups avoid full table scans (roughly O(log n) with PostgreSQL's default B-tree indexes).

• Database Pooling: We manage connection pooling to prevent database overload during traffic spikes.

9. Multi-Tenant Roadmap

While the current deployment serves a single EdTech platform, I designed the data model to evolve into a multi-tenant identity provider—effectively running my own internal Auth0.

To make this leap, the system is ready for specific schema changes:

1. Tenant Isolation: Introducing a tenants table and scoping email uniqueness to (tenant_id, email) rather than globally.

2. Token Claims: Adding tenant_id to the JWT claims so downstream services know immediately which "app" a user belongs to.

3. Distributed State: Moving the in-memory rate limiting to Redis to support distributed workers.
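The first two roadmap items can be illustrated with a small schema sketch. In-memory SQLite stands in for PostgreSQL here, and the column names, tenant names, and helper functions are hypothetical; what matters is the composite uniqueness constraint and the extra claim.

```python
import sqlite3

tenant_db = sqlite3.connect(":memory:")
tenant_db.execute("PRAGMA foreign_keys = ON")
tenant_db.executescript("""
CREATE TABLE tenants (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    tenant_id INTEGER NOT NULL REFERENCES tenants(id),
    email TEXT NOT NULL,
    UNIQUE (tenant_id, email)  -- uniqueness is per tenant, not global
);
""")
tenant_db.execute("INSERT INTO tenants (name) VALUES ('edtech'), ('other_app')")
tenant_db.commit()


def register(tenant_id: int, email: str) -> bool:
    """Insert a user; False if the email is taken in this tenant or the tenant is unknown."""
    try:
        with tenant_db:  # commits on success, rolls back on error
            tenant_db.execute(
                "INSERT INTO users (tenant_id, email) VALUES (?, ?)",
                (tenant_id, email),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def token_claims(user_id: int, tenant_id: int) -> dict:
    """Claims a signed JWT could carry so downstream services see the tenant."""
    return {"sub": str(user_id), "tenant_id": tenant_id}


print(register(1, "learner@example.com"))  # first tenant
print(register(2, "learner@example.com"))  # same email, different tenant
print(register(1, "learner@example.com"))  # duplicate within a tenant
```

The same person can then hold separate accounts in two products, while a duplicate signup inside one product is still rejected by the database itself.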

10. What I’d Improve (The "Honest" Part)

Building this wasn't without tradeoffs.

• Email Deliverability: When you roll your own auth, you are responsible for the emails landing in the inbox. I use external SMTP providers (like SendGrid/SES), but managing this reliability is an ongoing task compared to a managed service.

• Complexity: I had to implement refresh token rotation and security headers manually. These are checkboxes you get "for free" with Auth0.

However, these were known costs I was willing to pay for ownership.

11. Business Impact

This project wasn't just about saving $249 a month. It was about structural economics.

• Cost: My auth costs are now effectively flat, regardless of how many free users sign up.

• Leverage: I now have a reusable identity asset I can deploy for any future SaaS product in minutes.

12. Who This Is For

This architecture is ideal for:

• B2C Founders who need to support large volumes of free/freemium users.

• Indie Hackers who want to avoid vendor lock-in early.

• Product Managers who need specific custom flows (like passwordless OTP) that vendors might overcharge for.

13. Links & Code

The full source code is open source and available for study or forking.

• GitHub Repository: https://github.com/AkshayThoolkar/ai_edtech_auth_service.git
