Gmail Check: Verify Email Availability in PHP (SMTP & MX)

Why scraping Google's sign-in fails and how to verify Gmail in PHP: syntax, MX, SMTP RCPT probes and email confirmation. Don't rely on Google sign-in scraping.

How to reliably check if a Gmail email address is available using PHP? Current cURL method returns true for taken emails due to Google blocking.

I need a PHP function to verify if a Gmail address (e.g., username@gmail.com) exists or is available. My current implementation uses cURL to simulate Google’s sign-in process by fetching the identifier page, extracting the XSRF token, and submitting the email. However, it incorrectly returns true (available) even for existing emails, likely because Google detects and blocks the automated requests.

Here’s the problematic code:

php
function checkGmail($mail){
 echo "[Gmail] Testing if exists: $mail\n";
 flush();
 
 $ch = curl_init();
 
 // First: Get initial page to extract XSRF token
 curl_setopt($ch, CURLOPT_URL, 'https://accounts.google.com/signin/v2/identifier');
 curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
 curl_setopt($ch, CURLOPT_TIMEOUT, 3);
 curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);
 curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
 curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36');
 curl_setopt($ch, CURLOPT_COOKIEJAR, '/tmp/gmail_cookies.txt');
 
 $res = curl_exec($ch);
 $err = curl_error($ch);
 curl_close($ch);
 
 if($err){
 echo "[Gmail] Network error, allowing\n";
 flush();
 return true;
 }
 
 // Extract XSRF token
 if(preg_match('/\"_XSRF_TOKEN\"\s*\,\s*\"([^\"]+)\"/', $res, $m)){
 $xsrf = $m[1];
 } else if(preg_match('/name=\"_xsrf_token\"\s+value=\"([^\"]+)\"/', $res, $m)){
 $xsrf = $m[1];
 } else {
 $xsrf = '';
 }
 
 // Now check if email exists using the identifier endpoint
 $ch = curl_init();
 curl_setopt($ch, CURLOPT_URL, 'https://accounts.google.com/signin/v2/identifier?service=accountsettings&continue=https://myaccount.google.com&osid=1&flowName=GlifWebSignIn&flowEntry=ServiceLogin');
 curl_setopt($ch, CURLOPT_POST, 1);
 curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
 curl_setopt($ch, CURLOPT_TIMEOUT, 3);
 curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);
 curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
 curl_setopt($ch, CURLOPT_HTTPHEADER, [
 'Content-Type: application/x-www-form-urlencoded',
 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
 ]);
 curl_setopt($ch, CURLOPT_COOKIEFILE, '/tmp/gmail_cookies.txt');
 curl_setopt($ch, CURLOPT_COOKIEJAR, '/tmp/gmail_cookies.txt');
 
 $postData = 'identifier='.$mail.'&_xsrf_token='.$xsrf;
 curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
 
 $res2 = curl_exec($ch);
 $http = curl_getinfo($ch, CURLINFO_HTTP_CODE);
 $err2 = curl_error($ch);
 curl_close($ch);
 
 // Analyze response
 if($err2){
 echo "[Gmail] Error, allowing: $err2\n";
 flush();
 return true;
 }
 
 // Check response for existence indicators
 if(stripos($res2, 'Could not find your Google Account') !== false || 
 stripos($res2, 'account does not exist') !== false ||
 stripos($res2, 'no account found') !== false){
 echo "[Gmail] ✓ Available (not found)\n";
 flush();
 return true;
 }
 
 if(stripos($res2, 'password') !== false || 
 stripos($res2, 'enter your password') !== false ||
 stripos($res2, 'verify it\'s you') !== false){
 echo "[Gmail] ✗ Taken (account found)\n";
 flush();
 return false;
 }
 
 // If we get HTML form asking for password = account exists
 if(stripos($res2, '<input') !== false && stripos($res2, 'password') !== false){
 echo "[Gmail] ✗ Taken (password form shown)\n";
 flush();
 return false;
 }
 
 echo "[Gmail] Unclear response, assuming taken\n";
 flush();
 return false;
}

Issues:

  • Function assumes availability on errors or unclear responses.
  • Google likely blocks or rate-limits these requests, causing false positives.

Questions:

  • Why does this fail for existing emails, and how can I fix the cURL logic (e.g., better headers, proxies, response parsing)?
  • What are reliable alternative methods to check Gmail availability in PHP (e.g., APIs, SMTP checks, other endpoints)?
  • Best practices for email validation without violating Google’s terms?

Scraping Google's sign-in pages with cURL is inherently unreliable for checking Gmail addresses: Google deliberately detects and blocks automated sign-in probes, so your current approach will keep producing false positives. The dependable solution is to send a confirmation email (or use Workspace APIs for accounts you control). As a fallback in PHP, run syntax, MX and disposable-domain checks plus a cautious SMTP RCPT probe, but treat any network error or ambiguous response as “unknown” and require email verification.


Why your cURL gmail check fails

Short answer: Google intentionally makes automated account enumeration hard. Your cURL flow (fetch sign-in page → extract a token → post the identifier) looks like a bot. Google's sign-in endpoints rely on JavaScript, dynamic tokens, CAPTCHAs and anti-automation heuristics; when a request looks automated, Google blocks it, rate-limits it, or returns a generic page that looks the same whether or not the address exists. For related background on how Gmail blocks automated or suspicious traffic, see PostmanSMTP's writeup: https://postmansmtp.com/gmail-blocking-emails-issue/ and Jivrus's article: https://www.jivrus.com/resources/articles/google-workspace/why-emails-are-blocked-by-gmail-and-how-to-resolve.

Concrete failure modes you’ll see in your cURL script:

  • The response is the same for existing and non‑existing addresses (anti‑enumeration).
  • Google returns a JS/CAPTCHA challenge that cURL can’t execute.
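  • Google returns HTTP 429/503 or a silent redirect. Your script reads the status code into $http but never checks it, and any cURL error (including the 3-second timeout) returns true, i.e. “available.”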
  • IP reputation / rapid requests cause throttling or greylisting, producing inconsistent results.

Because of those behaviours, treating network errors or unclear HTML as “available” is unsafe — that’s the opposite of what you want.


Why scraping sign‑in pages is risky

Two short rules:

  • Don’t rely on HTML scraping to verify user accounts — it’s fragile and will break.
  • Don’t build logic that assumes “no useful response = available” — that opens you to false positives and abuse.

There’s also a ToS and privacy angle: probing sign‑in endpoints or automating sign‑in flows can violate Google’s terms and will frequently get your IP addresses blocked. The official Gmail APIs don’t provide a public arbitrary account existence check (they provide controlled, authorized methods for Workspace-managed accounts only), so you’re pushing against intentional limits — see the Gmail API docs about alias verification for Workspace: https://developers.google.com/workspace/gmail/api/reference/rest/v1/users.settings.sendAs/verify.


Fixes to your cURL logic (if you must try)

If you still want to improve your cURL attempts for debugging or a short-lived experiment, make these changes — but keep expectations low and don’t assume a correct result.

What to do (short list)

  • Don’t treat errors as “available.” Treat network errors, 429, 503, or CAPTCHAs as unknown/blocked.
  • Check HTTP status codes and response headers as well as body. If you get 429/302-to-CAPTCHA/503, treat as blocked.
  • Use proper browser headers and cookie handling (but note: headers alone rarely bypass bot checks).
  • Accept compressed responses (CURLOPT_ENCODING => '') and follow redirects (CURLOPT_FOLLOWLOCATION => true).
  • Use a full JS-capable browser (headless Chromium via Puppeteer/Playwright) if you must run client JS — but this is heavy, brittle, and still may be blocked.
  • Don’t brute‑force or rotate proxies to evade rate limits — that invites blocks and may violate ToS.

Minimal improved cURL pattern (still not guaranteed):

php
$ch = curl_init('https://accounts.google.com/signin/v2/identifier');
curl_setopt_array($ch, [
 CURLOPT_RETURNTRANSFER => true,
 CURLOPT_FOLLOWLOCATION => true,
 CURLOPT_ENCODING => '', // accept gzip/deflate/br
 CURLOPT_TIMEOUT => 10,
 CURLOPT_CONNECTTIMEOUT => 5,
 CURLOPT_SSL_VERIFYPEER => true,
 CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
 CURLOPT_HTTPHEADER => [
 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
 'Accept-Language: en-US,en;q=0.9',
 'Connection: keep-alive'
 ],
 CURLOPT_COOKIEJAR => '/tmp/gmail_cookies.txt',
 CURLOPT_COOKIEFILE => '/tmp/gmail_cookies.txt'
]);
$res = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);

// Check $info['http_code'] and look for 429/302-to-CAPTCHA, don't rely on HTML text alone.

Key point: even with better headers and cookies you’ll hit JS/CAPTCHA barriers. At that point you’d need a headless browser — which is resource intensive and still a brittle workaround.
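
The status handling described in the list above could be sketched as a small classifier. This is only an illustration: the function name, the outcome labels, and the "captcha" marker string are assumptions rather than anything Google documents. It takes the values produced by curl_error(), curl_getinfo() and curl_exec(), and it deliberately never returns "available".

```php
<?php
// Hypothetical helper: maps one cURL probe result to a cautious outcome.
// It never concludes "available"; that is left to email confirmation.
function classify_probe(string $curlError, array $info, string $body): string
{
    if ($curlError !== '') {
        return 'unknown';        // a network failure says nothing about the address
    }
    $code = (int)($info['http_code'] ?? 0);
    if ($code === 429 || $code === 503) {
        return 'blocked';        // rate-limited or temporarily refused
    }
    // With CURLOPT_FOLLOWLOCATION enabled, 'url' is the final URL after redirects.
    $finalUrl = (string)($info['url'] ?? '');
    // Assumed marker: a challenge page mentions "captcha" in its URL or body.
    if (stripos($finalUrl, 'captcha') !== false || stripos($body, 'captcha') !== false) {
        return 'blocked';
    }
    if ($code < 200 || $code >= 300) {
        return 'unknown';        // any other non-2xx status
    }
    return 'inconclusive';       // 2xx HTML still proves nothing about existence
}
```

Every outcome here should lead to the same next step: ask the user to confirm the address by email.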


Reliable gmail verify methods with PHP (best‑effort function)

There is no 100% reliable, public programmatic way to confirm arbitrary username@gmail.com addresses without sending mail. The practical options are:

  1. The reliable method: send a verification email with a token and confirm the user clicked it.
  2. Best‑effort pre‑checks in PHP (syntax, MX, disposable domain, optional SMTP RCPT probe). Use them to reduce obvious bad addresses, but always follow up with a verification email.

Below is a pragmatic PHP implementation you can use as a best-effort pre-check. It returns one of four statuses: invalid (failed the syntax check), deliverable (SMTP RCPT returned a positive code), undeliverable (no mail host, disposable domain, or the server explicitly rejected the address), or unknown (greylist, blocked, network error). Treat unknown as “must verify by email.”

Note: this code performs plain SMTP RCPT probes. The protocol permits them, but Gmail and other big providers sometimes defer, greylist, or accept the recipient and bounce later, and many hosting networks block outbound connections on port 25, in which case every probe comes back as unknown. See community discussion about SMTP checks and limitations: https://stackoverflow.com/questions/19261987/how-to-check-if-an-email-address-is-real-or-valid-using-php and example verifier implementations: https://github.com/hbattat/verifyEmail and https://github.com/reacherhq/check-if-email-exists.

PHP best‑effort verifier (trimmed, production needs more robustness):

php
<?php
// Helper: read multi-line SMTP response until the line that ends the response
function smtp_read_response($fp, $timeout = 5) {
 stream_set_timeout($fp, $timeout);
 $data = '';
 while (($line = fgets($fp, 515)) !== false) {
 $data .= $line;
 // If the 4th char is space, that's end of response block per SMTP
 if (isset($line[3]) && $line[3] === ' ') break;
 }
 return $data;
}

function get_mx_hosts($domain) {
 $hosts = [];
 if (@getmxrr($domain, $hosts, $weights) && count($hosts)) {
 // sort by weight
 array_multisort($weights, $hosts);
 return $hosts;
 }
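 $records = @dns_get_record($domain, DNS_MX);
 // dns_get_record() returns an empty array when no MX exists; test for that so the A/AAAA fallback below is reachable
 if (!empty($records)) {
 // MX priority is exposed under the 'pri' key
 usort($records, function($a, $b){ return $a['pri'] - $b['pri']; });
 $out = [];
 foreach ($records as $r) $out[] = rtrim($r['target'], '.');
 return $out;
 }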
 // fallback to A/AAAA
 if (checkdnsrr($domain, 'A') || checkdnsrr($domain, 'AAAA')) return [$domain];
 return [];
}

function smtp_rcpt_check($email, $from = 'verify@yourdomain.com', $timeout = 5) {
 if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
 return ['status'=>'invalid','reason'=>'bad_syntax'];
 }
 $domain = substr(strrchr($email, "@"), 1);
 // IDN handling if needed
 if (function_exists('idn_to_ascii')) {
 $domain = idn_to_ascii($domain, 0, INTL_IDNA_VARIANT_UTS46);
 }
 $mxs = get_mx_hosts($domain);
 if (empty($mxs)) return ['status'=>'undeliverable','reason'=>'no_mx'];

 foreach ($mxs as $mx) {
 $errno = 0; $errstr = '';
 $fp = @stream_socket_client("tcp://{$mx}:25", $errno, $errstr, $timeout);
 if (!$fp) continue;
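 // Read banner; skip this MX unless it greets with 220
 $banner = smtp_read_response($fp, $timeout);
 if (strpos($banner, '220') !== 0) { fclose($fp); continue; }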
 fwrite($fp, "EHLO example.com\r\n");
 smtp_read_response($fp, $timeout);
 fwrite($fp, "MAIL FROM:<{$from}>\r\n");
 smtp_read_response($fp, $timeout);
 fwrite($fp, "RCPT TO:<{$email}>\r\n");
 $rcpt = smtp_read_response($fp, $timeout);
 fwrite($fp, "QUIT\r\n");
 fclose($fp);

 if (preg_match('/^([0-9]{3})/m', $rcpt, $m)) {
 $code = intval($m[1]);
 if (in_array($code, [250, 251])) {
 return ['status'=>'deliverable','mx'=>$mx,'code'=>$code,'response'=>trim($rcpt)];
 }
 if (in_array($code, [550, 551, 553])) {
 return ['status'=>'undeliverable','mx'=>$mx,'code'=>$code,'response'=>trim($rcpt)];
 }
 // 450/451/452 => temporary failure / greylist / unknown
 return ['status'=>'unknown','mx'=>$mx,'code'=>$code,'response'=>trim($rcpt)];
 } else {
 return ['status'=>'unknown','mx'=>$mx,'response'=>trim($rcpt)];
 }
 }
 return ['status'=>'unknown','reason'=>'no_connection'];
}

// Wrapper that runs common checks and returns best-effort outcome
function verify_email_precheck($email, $from='verify@yourdomain.com') {
 // 1) Syntax
 if (!filter_var($email, FILTER_VALIDATE_EMAIL)) return ['status'=>'invalid','reason'=>'syntax'];

 // 2) Quick disposable domain check (use a full db in prod)
 $disposable = ['mailinator.com','10minutemail.com']; // example only
 $domain = strtolower(substr(strrchr($email,'@'),1));
 if (in_array($domain, $disposable)) return ['status'=>'undeliverable','reason'=>'disposable'];

 // 3) MX + optional SMTP RCPT probe
 $smtp = smtp_rcpt_check($email, $from, 8);
 return $smtp;
}

How to interpret results:

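  • status = invalid → address failed the syntax check; ask the user to correct it.
  • status = deliverable → server accepted RCPT (likely exists but not guaranteed).
  • status = undeliverable → server explicitly rejected the address, or the domain has no mail host or is on the disposable list (likely unusable).
  • status = unknown → temporary failure / greylist / blocked; send a verification email to be sure.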

Remember: Gmail sometimes accepts RCPT but later bounces, or it may defer/greylist. Community notes and tools (for example: https://github.com/hbattat/verifyEmail and https://github.com/reacherhq/check-if-email-exists) show that Gmail behavior can be inconsistent; treat SMTP checks as heuristic only.
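
One way to make the heuristic a little less brittle is to read the enhanced status code (RFC 3463) that many servers append after the three-digit reply. The sketch below is an assumption-laden illustration, not part of the verifier above: classify_rcpt_reply is a hypothetical name, and treating 5.7.x (security or policy status) as "unknown" is a design choice made here because such a reply usually means the probe itself was refused, not that the mailbox is missing.

```php
<?php
// Hypothetical helper: classify a (possibly multi-line) RCPT TO reply.
function classify_rcpt_reply(string $reply): string
{
    // Three-digit reply code at the start of the first line.
    if (!preg_match('/^(\d{3})[ -]/', $reply, $m)) {
        return 'unknown';
    }
    $code = (int)$m[1];

    // Optional enhanced status code right after the reply code, e.g. "5.1.1".
    $enhanced = null;
    if (preg_match('/^\d{3}[ -](\d\.\d{1,3}\.\d{1,3})(?:\s|$)/', $reply, $e)) {
        $enhanced = $e[1];
    }

    if ($code === 250 || $code === 251) {
        return 'deliverable';
    }
    if (in_array($code, [550, 551, 553], true)) {
        // 5.7.x = security/policy refusal: says nothing about the mailbox.
        if ($enhanced !== null && strpos($enhanced, '5.7.') === 0) {
            return 'unknown';
        }
        return 'undeliverable';
    }
    return 'unknown'; // 4xx temporary failures and everything else
}
```

For example, a reply beginning "550-5.1.1 The email account that you tried to reach does not exist" classifies as undeliverable, while "550 5.7.1 ..." stays unknown.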


Using the Gmail API — limitations

If you control a Google Workspace domain and have domain‑wide authority, some API methods help manage or verify aliases — for instance, the Gmail API method to verify a send‑as alias sends a verification email, but it does not let you probe arbitrary @gmail.com addresses for existence: https://developers.google.com/workspace/gmail/api/reference/rest/v1/users.settings.sendAs/verify. In short: there is no public Google API you can call to check whether any random username@gmail.com exists without user consent.


Best practices: verification, deliverability, UX

What to do in production

  • Always confirm via a tokenized verification email for critical flows (account creation, password reset). That’s the only reliable confirmation.
  • Use a transactional email provider (SendGrid, Postmark, Mailgun) for good deliverability and reliable bounce handling. Monitor bounces and suppress permanently failed addresses. PostmanSMTP and other deliverability writeups explain Gmail’s protections and how they affect sending: https://postmansmtp.com/gmail-blocking-emails-issue/.
  • Protect user privacy: don’t reveal “this email exists” in public-facing UI. Avoid account enumeration (e.g., don’t show “no such account” on forgotten‑password screens; use neutral messaging).
  • Implement progressive verification: accept the email, mark as unverified, allow the user to continue limited flow, and require click confirmation for sensitive actions.
  • Validate syntax, check MX, detect disposable domains, run an optional SMTP probe — but assume the probe is heuristic. Keep a safe fallback: require click verification.
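
The tokenized verification email from the first bullet could be sketched roughly as follows. The function names, the 32-byte token size, and the idea of storing only a SHA-256 hash next to an expiry timestamp are assumptions for illustration; persistence and the actual sending are left to your application and mail provider.

```php
<?php
// Hypothetical helpers; where the hash and expiry are stored is up to the application.

// Create a token for the emailed link plus the hash to keep server-side.
function make_verification_token(): array
{
    $token = bin2hex(random_bytes(32));   // 64 hex characters, goes into the link
    return ['token' => $token, 'hash' => hash('sha256', $token)];
}

// Check the token from a clicked link against the stored hash and expiry time.
function token_is_valid(string $presented, string $storedHash, int $expiresAt, ?int $now = null): bool
{
    $now = $now ?? time();
    if ($now > $expiresAt) {
        return false;                     // link expired
    }
    // hash_equals() compares in constant time
    return hash_equals($storedHash, hash('sha256', $presented));
}

// Usage sketch; the URL is a placeholder.
$t = make_verification_token();
$link = 'https://example.com/verify?token=' . $t['token'];
```

Storing the hash instead of the raw token means a leaked database row cannot be replayed as a valid link.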

Deliverability checklist (quick):

  • Set SPF, DKIM, DMARC for your sending domain.
  • Use a reputable sending IP/domain and warm it up.
  • Track bounces and complaints and remove bad addresses.
  • Respect rate limits: back off and retry later on 4xx (temporary) responses, and stop sending to addresses that return 5xx (permanent) failures.
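
As a rough sanity check for the SPF and DMARC items, you can look for the version tag at the start of the relevant TXT records. The helper below is a hypothetical sketch: it only confirms that a record exists, not that the policy is correct, and it skips DKIM because that lookup needs your provider's selector name.

```php
<?php
// Hypothetical helper: true if any TXT record starts with the given version tag.
// $records has the shape returned by dns_get_record($name, DNS_TXT).
function has_txt_policy(array $records, string $tag): bool
{
    foreach ($records as $r) {
        $txt = ltrim((string)($r['txt'] ?? ''));
        if (stripos($txt, $tag) === 0) {
            return true;
        }
    }
    return false;
}

// Live lookups need DNS access; "example.com" is a placeholder domain.
// SPF lives on the domain itself, DMARC on the _dmarc subdomain.
// $spf   = has_txt_policy(dns_get_record('example.com', DNS_TXT) ?: [], 'v=spf1');
// $dmarc = has_txt_policy(dns_get_record('_dmarc.example.com', DNS_TXT) ?: [], 'v=DMARC1');
```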

How to treat ambiguous results in production

Practical policies:

  • If SMTP check → deliverable: mark as likely good, but still send verification email before granting full access.
  • If SMTP check → undeliverable: block or ask user for a different address.
  • If SMTP check → unknown / network error / blocked: treat as unverified and require email confirmation. Do not assume “available”.
  • If your business flow must decide immediately (e.g., sign‑up throttle): prefer false negatives (ask for confirmation) over false positives (creating accounts for addresses you can’t reach).
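
These policies could be encoded as a small mapping from the pre-check status to the next step. The function name and the action labels below are hypothetical; the input is assumed to be the array returned by verify_email_precheck() above.

```php
<?php
// Hypothetical mapping from a pre-check result to the next sign-up step.
// Action and hint names are illustrative only.
function next_signup_step(array $precheck): array
{
    $status = $precheck['status'] ?? 'unknown';

    if ($status === 'invalid' || $status === 'undeliverable') {
        return ['action' => 'ask_for_other_address', 'verified' => false];
    }

    // deliverable = likely good, unknown = no signal.
    // Both still require the confirmation click before full access.
    return [
        'action'   => 'send_verification_email',
        'hint'     => $status === 'deliverable' ? 'likely_good' : 'unverified',
        'verified' => false,
    ];
}
```

Note that no branch ever sets verified to true; only the confirmation click should do that.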

User UX tip: show friendly copy, for example “We've sent a verification link to username@gmail.com. Please click it to continue.” Don't reveal whether an address already has an account elsewhere.


Quick comparison: methods pros / cons


Sources

  1. How to fix Gmail Blocking Emails Issue — PostmanSMTP
  2. Why emails are blocked by Gmail and how to resolve? — Jivrus
  3. How to check if an email address is real or valid using PHP — Stack Overflow
  4. GitHub - hbattat/verifyEmail
  5. How to programmatically check if a Gmail email address exists — Stack Overflow
  6. GitHub - reacherhq/check-if-email-exists
  7. Gmail API: users.settings.sendAs.verify — Google Developers
  8. Verify Gmail Address via SMTP Using PHP — FormGet
  9. Validate an Email Address with PHP — Labnol

Conclusion

For a reliable Gmail check in PHP, stop trying to scrape Google's sign-in page; it is designed to block that. Use a layered approach: syntax + MX + disposable checks, optional SMTP RCPT as a heuristic, and always confirm by sending a verification email (or use Workspace APIs for accounts you administer). If your cURL flow returns ambiguous or error responses, treat them as unknown (not “available”) and require the user to verify by clicking a link in their inbox.
