Skip to content
February 14, 2009 / steve8

palindrome Pangram Puzzle – a work in progress

Definitions:

A palindrome is a sequence of words like "lid off a daffodil" or "shallot ayatollahs" that uses the same letters reading backwards as forwards.

A pangram is a sequence of words that contains every letter of the alphabet (at least once).

Puzzle:

Find the shortest sequence of words that is both a pangram and a palindrome.

The sequence need not form a meaningful or grammatical sentence.

Use this dictionary: WORD.LST

from ITA Software

is this solvable with 1 man’s resources and timeframe?

Thought #1:

Let’s just try every single combination of letters, check if its palindromic and pangramic, then if so, check if its made up of words in the dictionary…

Problem with thought #1:

That’s fine except the minimum number of letters in a pangram is 26 and a palindromic pangram would be at least 51 (like ‘abcdefghijklmnopqrstuvwxyzyxwvutsrqponmlkjihgfedcba’)

with 26 options for each letter position thats 26^51=~ 1.45*10^72

Keep in mind a high end computer today will process about 50 billion instructions per second, that is ~1.5*10^17 instructions per year…    and a computer will not be able to check if an array of 51 letters can be created with a 170,000 word list in 1 instruction.

Thought #1a:

Let’s just try every single palindromic pangram we can make, then check if it can be created with words from the dictionary.

Problem with thought #1a:

That’s fine except, just for the 51 letter scenario, with our 26 letter alphabet:

there would be:

26 choices for the first letter, 25 for the second, 24 for the third… etc.

26*25*24*23*22*21*20*19*18*17*16*15*14*13*12*11*10*9*8*7*6*5*4*3*2*1

those are the choices for the first 26 letters , there are no decisions to make for the next 25 in the sequence of letters since we are assuming it is a palindrome, so it must be a mirror from left to right.

that means we would have 26! (factorial) palindromic pangrams to check.

and the ‘check’ is non-trivial, checking a 51 character sequence of letters to see if it can be created with a 170,00 word dictionary.

26!=~4*10^26 which is far better than 1.45*10^72, but when we are dealing with processors that can only do 1.5*10^17 instructions per year.

Thought #2:

Let’s just construct every 51 letter sentence that we can with the dictionary, then check if its a palindrome and/or a pangram.

Problem with thought #2:

Well for the sake of the point lets say all 170,000 words are 3-letters.

then 51 characters would be 17 words, and each word would have 170,000 possibilities.

That’s 170,000^17=~ 8.27*10^88, even more things to check than idea #1!

the exact number of combinations is a much more complex topic, since all words are not 3-letters, but that’s the order of it, 8*10^88, way too many.

Thought #3: (getting desperate)

Let’s throw out all redundant words to try to reduce the dictionary.

Example:

‘human’ is a redundant word since it can be created with ‘hu’ and ‘man’ which are both words in the dictionary.

So ‘humane’ is totally redundant for this problem since smaller words can make it if needed, and there is nothing that we can create with ‘humane’ that we can’t create with ‘hu’ ‘man’

thoughts on thought #3:

for each word, to check if its redundant we must check if the first 1 letter of the word is a word in the wordlist, if the first 2 letters is a word in the wordlist, if the first 3 letters is a word in the word list… and so on

if ever one of the answer is yes, then you take the next 1 letter, the next 2 letters etc until you find a combination of words that can create the word exactly

like humane:

does ‘h’ exist in wordlist (no)

does ‘hu’ exist in wordlist (yes)

->does ‘m’ exist in wordlist (no)

->does ‘ma’ exist in wordlist (yes)

->->does ‘n’ exist in wordslit (no)

->->does ‘ne’ exist in wordlist (yes)

no more letters in the word, done.

so there we only had to search the wordlist 6 times

which isn’t so bad but we have 170,000 words in the wordlist,

>There will be a lot of searching.

since my wordlist is ordered I can do a binary search, which would require about 18 checks per search, each check would be to compare alphabetically the word we are searching for with the one in the list we are checking… that is not so bad.

a better method to search:

I can create an list of lists of words,

so the first element of the list contains a list of words, all starting with ‘aa’

the 2nd element of the list contains a list of words, all starting with ‘ab’

the 26^2 th element of the array contains a list of words, all starting with ‘zz’

if the words were evenly distributed across all 2-char starting strings, there would be 251 words starting with each 2-char string.

now the task to find a word is even easier, to find the word ‘hello’ we simply go to the index ‘he’ which has a numeric value equal to the position in the alphabet of each character h=7, e=4, (a=0), in base 26 that’s 74, in decimal we are looking at index 186, that little calculation takes almost no time, and then searching the resulting list of ~251 words with binary search would only be around 8 checks, each alphabetically comparing a word to another… but in lists so small, checking word by word is simpler, and empirically seems to be faster.

Result from thought #3

This reduces the list from 173,528 to 76,421

My little computer did this in 33 seconds with python.

 

 

Thought #3b:

other words that should never be considered are those that cannot be reversed… in other words, words who’s mirror can never be created using the word list.  (these words will never successfully create a palindrome)

 

initially this might seem relatively simple..

take two words

humanity and humanization.

 

if you are going to use either word in a palindrome, you must be able to create the reverse string somehow.

it doesn’t mean the words ‘ytinamuh’ or ‘noitazinamuh’ must exist,

for example the words ‘

‘poetry’,‘tin’,‘amu’,‘hu’ when concatenated is poetrytinamuhu

which of course contains ‘ytinamuh’:

poetrytinamuhu

so this word should be kept in the list, as it may be used in a palindrome.

 

Results from thought #3b:

pass 1:

38 minutes to reduce the wordlist from 76,421 to 39,822 words

pass 2:

9.5 minutes to reduce the wordlist from 39,822 to 39,265

pass 3:

no reduction, done.

 

 

I’m not done…

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: