Fix geocoder returning empty when a longer prefix lacks the requested language by gaoflow · Pull Request #323 · daviddrysdale/python-phonenumbers

gaoflow · 2026-06-25T12:09:00Z

Problem

_prefix_description_for_number (in prefix.py) immediately returned an
empty string when a longer (more-specific) prefix existed in the geocoding
data but did not carry an entry for the requested language. This prevented
the loop from falling through to shorter prefixes that do have the language
data, causing description_for_number to fall back all the way to the country
name.

Concrete example

Finnish Helsinki-area numbers (+3589x…) are described as 'Finland' instead
of 'Helsinki' when English or Finnish is requested:

>>> import phonenumbers
>>> from phonenumbers import geocoder
>>> num = phonenumbers.parse("+358912345678")
>>> geocoder.description_for_number(num, "en")
'Finland'   # ← wrong; expected 'Helsinki'
>>> geocoder.description_for_number(num, "fi")
'Suomi'     # ← wrong; expected 'Helsinki'

Root cause: the production geocoding data contains a 5-digit prefix
'35891' with only {'se': 'Helsingfors'} (Swedish via the 'se' locale key),
and a 4-digit prefix '3589' with {'en': 'Helsinki', 'fi': 'Helsinki', 'sv': 'Helsingfors'}. Because '35891' is the longer (more-specific) match
and is found first, the old code returned '' without ever reaching '3589'.

The same defect affects a handful of other prefixes (DRC, Armenia, Italy,
Taiwan) where a sub-prefix carries an alternative-language-only entry while the
parent prefix has the common languages.

Fix

Remove the else: return U_EMPTY_STRING branch so that the loop continues to
shorter prefixes when _find_lang returns None. The function still returns
U_EMPTY_STRING at the end of the loop when no prefix at all has usable data.

-            if name is not None:
-                return name
-            else:
-                return U_EMPTY_STRING
+            if name is not None:
+                return name
+            # Language not available for this prefix; try a less-specific (shorter) prefix.

Tests

A regression test (testPrefixFallsBackToShorterPrefixWhenLanguageMissing)
is added to geocodertest.py. It injects a synthetic longer-prefix entry
into the test geodata (one that has only German) and verifies that a French
request correctly falls through to the shorter prefix's English data. All
280 existing tests continue to pass.

This pull request was prepared with the assistance of AI, under my direction and review.

_prefix_description_for_number immediately returned an empty string when a longer prefix was found in the geocoding data but did not carry an entry for the requested language. This prevented the loop from trying shorter (less specific) prefixes that do have the language data. For example, Finnish Helsinki-area numbers (+3589x…) were mapped to 'Finland' rather than 'Helsinki' when English or Finnish was requested, because the 5-digit prefix '35891' exists with only a Swedish ('se') entry while the 4-digit prefix '3589' carries 'en', 'fi', and 'sv' data. Fix: instead of returning U_EMPTY_STRING when _find_lang yields None, allow the loop to continue to shorter prefixes. Add a regression test that drives this path using synthetic test geodata so that the fix is verifiable without depending on the full production data files.

daviddrysdale

Thank you for spotting and fixing!

daviddrysdale · 2026-06-25T15:31:51Z

+        TEST_GEOCODE_DATA['12123'] = {'de': u("TestDistrict")}
+        try:
+            # US_NUMBER3 is +1 212-812-0000; its prefix digits start with '12128' so
+            # it won't hit '12123'.  Use a fabricated number whose digits begin '12123'.


nit: the first sentence of the comment is unhelpful, please remove it.

daviddrysdale · 2026-06-25T15:32:23Z

        self.assertEqual("South Korea", description_for_number(KO_MOBILE, _ENGLISH))

+    def testPrefixFallsBackToShorterPrefixWhenLanguageMissing(self):
+        # When a longer prefix exists in the geocoding data but lacks the requested


Please could you add "Python version extra test" at the top of the comment as reminder that this has no equivalent Java test.

daviddrysdale approved these changes Jun 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix geocoder returning empty when a longer prefix lacks the requested language#323

Fix geocoder returning empty when a longer prefix lacks the requested language#323
gaoflow wants to merge 1 commit into
daviddrysdale:devfrom
gaoflow:fix-geocoder-prefix-language-fallback

gaoflow commented Jun 25, 2026

Uh oh!

daviddrysdale left a comment

Uh oh!

daviddrysdale Jun 25, 2026

Uh oh!

daviddrysdale Jun 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

gaoflow commented Jun 25, 2026

Problem

Concrete example

Fix

Tests

Uh oh!

daviddrysdale left a comment

Choose a reason for hiding this comment

Uh oh!

daviddrysdale Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

daviddrysdale Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants