Files
ALwrity/GSC_INTEGRATION_README.md
Om-Singh1808 0a7d9bfd21 feat: Complete Google Search Console integration with Clerk authentication
- Add GSC API service with OAuth2 authentication
- Implement Clerk authentication for frontend and backend
- Add GSC login button and OAuth callback handling
- Create comprehensive GSC data fetching and caching
- Add authentication middleware for backend API protection
- Implement real-time GSC data integration in SEO dashboard
- Add user-specific GSC site management
- Include comprehensive logging and error handling
- Add TypeScript support and proper type definitions
- Create environment templates and setup documentation
- Update gitignore to exclude sensitive credential files

Features added:
- GSC OAuth2 authentication flow
- Real-time search analytics data
- Site list management
- Sitemap analysis
- User-specific data isolation
- Comprehensive error handling
- Authentication token management
- Popup-based OAuth flow
- Data caching and refresh mechanisms

Note: gsc_credentials.json should be created locally with your Google OAuth credentials
2025-09-24 15:12:21 +05:30

7.7 KiB

Google Search Console (GSC) Integration for ALwrity

This document describes the complete Google Search Console integration implemented for ALwrity, allowing users to connect their GSC accounts and fetch real website analytics data.

🚀 Features

Backend Features

  • OAuth2 Authentication: Secure Google OAuth2 flow for GSC access
  • User Credential Management: Encrypted storage of user OAuth tokens
  • Data Caching: SQLite-based caching system for GSC data
  • Multi-user Support: Each user can connect their own GSC account
  • Real-time Analytics: Fetch live search analytics, sitemaps, and site data
  • Comprehensive Logging: Detailed logging throughout the system

Frontend Features

  • GSC Login Button: Seamless OAuth connection flow
  • Status Management: Real-time connection status display
  • Popup Authentication: Secure OAuth flow in popup window
  • Error Handling: Comprehensive error management and user feedback
  • Responsive UI: Material-UI components matching existing dashboard style

📁 File Structure

Backend Files

backend/
├── services/
│   └── gsc_service.py              # Core GSC service with OAuth and data management
├── routers/
│   └── gsc_auth.py                 # FastAPI router for GSC endpoints
├── middleware/
│   └── auth_middleware.py          # Clerk authentication middleware
├── gsc_credentials.json            # Google OAuth2 client credentials
├── env_template.txt                # Environment variables template
└── requirements.txt                # Updated with GSC dependencies

Frontend Files

frontend/src/
├── api/
│   └── gsc.ts                      # GSC API client
├── components/SEODashboard/components/
│   ├── GSCLoginButton.tsx          # GSC connection UI component
│   └── GSCAuthCallback.tsx         # OAuth callback handler
├── env_template.txt                # Frontend environment template
└── package.json                    # Updated with Clerk dependencies

🔧 API Endpoints

GSC Authentication & Management

  • GET /gsc/auth/url - Get OAuth authorization URL
  • GET /gsc/callback - Handle OAuth callback
  • GET /gsc/status - Check GSC connection status
  • DELETE /gsc/disconnect - Revoke GSC access

GSC Data Retrieval

  • GET /gsc/sites - Get user's GSC sites
  • POST /gsc/analytics - Fetch search analytics data
  • GET /gsc/sitemaps/{site_url} - Get sitemaps for a site
  • GET /gsc/health - Health check endpoint

🗄️ Database Schema

GSC Credentials Table

CREATE TABLE gsc_credentials (
    user_id TEXT PRIMARY KEY,
    credentials_json TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

GSC Data Cache Table

CREATE TABLE gsc_data_cache (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id TEXT NOT NULL,
    site_url TEXT NOT NULL,
    data_type TEXT NOT NULL,
    data_json TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP NOT NULL,
    FOREIGN KEY (user_id) REFERENCES gsc_credentials (user_id)
);

🔐 Authentication Flow

  1. User clicks "Connect GSC" → Frontend requests OAuth URL
  2. Backend generates OAuth URL → Returns Google authorization URL
  3. User authorizes in popup → Google redirects to callback
  4. Backend handles callback → Exchanges code for tokens
  5. Credentials stored securely → User can now access GSC data
  6. Real data replaces mock data → Dashboard shows live analytics

🛠️ Setup Instructions

1. Backend Setup

  1. Install Dependencies:

    cd backend
    pip install -r requirements.txt
    
  2. Configure Environment:

    cp env_template.txt .env
    # Edit .env with your actual values
    
  3. Google OAuth Setup:

    • Copy your Google OAuth credentials to gsc_credentials.json
    • Ensure redirect URIs include both backend and frontend URLs
  4. Start Backend:

    python app.py
    

2. Frontend Setup

  1. Install Dependencies:

    cd frontend
    npm install
    
  2. Configure Environment:

    cp env_template.txt .env
    # Edit .env with your actual values
    
  3. Start Frontend:

    npm start
    

🔑 Environment Variables

Backend (.env)

# Clerk Authentication
CLERK_SECRET_KEY=your_clerk_secret_key_here

# Google Search Console
GSC_REDIRECT_URI=http://localhost:8000/gsc/callback

# Development Settings
DISABLE_AUTH=false

Frontend (.env)

# Clerk Authentication
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=your_clerk_publishable_key_here

# CopilotKit
REACT_APP_COPILOTKIT_API_KEY=your_copilotkit_api_key_here

📊 Data Types Retrieved

Search Analytics

  • Clicks: Number of clicks from search results
  • Impressions: Number of times site appeared in search
  • CTR: Click-through rate percentage
  • Position: Average position in search results

Site Information

  • Site URLs: List of verified sites in GSC
  • Permission Levels: User's access level for each site

Sitemaps

  • Sitemap Paths: URLs of submitted sitemaps
  • Submission Dates: When sitemaps were last submitted
  • Index Status: Which pages are indexed

🔒 Security Features

  • OAuth2 Security: Google's secure authorization protocol
  • Token Encryption: Credentials stored securely in database
  • User Isolation: Each user's data is completely separate
  • Token Refresh: Automatic token refresh when expired
  • Access Revocation: Users can disconnect at any time

🧪 Testing

Backend Testing

cd backend
python -m pytest test_gsc_*.py

Frontend Testing

cd frontend
npm test

Integration Testing

  1. Start both backend and frontend servers
  2. Navigate to SEO Dashboard
  3. Click "Connect GSC"
  4. Complete OAuth flow
  5. Verify real data appears in dashboard

🐛 Troubleshooting

Common Issues

  1. "Not Found" Error:

    • Check API endpoint paths match between frontend and backend
    • Ensure backend server is running
  2. "Not Authenticated" Error:

    • Verify Clerk API keys are correct
    • Check environment variables are loaded
  3. OAuth Popup Blocked:

    • Allow popups for localhost
    • Check browser popup settings
  4. GSC Data Not Loading:

    • Verify Google OAuth credentials
    • Check user has verified sites in GSC
    • Review backend logs for errors

📈 Performance Optimizations

  • Data Caching: GSC data cached for 1 hour to reduce API calls
  • Lazy Loading: Components load data only when needed
  • Error Boundaries: Graceful error handling prevents crashes
  • Connection Pooling: Efficient database connections

🔄 Future Enhancements

  • Real-time Updates: WebSocket-based live data updates
  • Advanced Analytics: More detailed GSC metrics and insights
  • Bulk Operations: Analyze multiple sites simultaneously
  • Export Features: Export GSC data to CSV/Excel
  • Scheduled Reports: Automated GSC reports via email

📝 Logging

The system includes comprehensive logging at all levels:

  • Backend: Detailed logs for OAuth flow, data retrieval, and errors
  • Frontend: Console logs for API calls and user interactions
  • Database: Query logging for debugging data issues

🤝 Contributing

When contributing to the GSC integration:

  1. Follow existing code patterns and style
  2. Add comprehensive logging for new features
  3. Include error handling for all API calls
  4. Update tests for any new functionality
  5. Document any new environment variables or setup steps

📄 License

This GSC integration is part of the ALwrity project and follows the same licensing terms.