Archive for Programming

How to read zip content in PHP?

To read the contents of a zip archive in PHP, the zip extension has to be enabled first. The extension is located in the /php/ext directory. To enable it, add this line to the php.ini file:
extension = php_zip.dll
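
Before calling any zip function, it can be worth confirming that the extension actually loaded. Here is a minimal sketch of such a check. The helper name zip_extension_status() is made up for illustration; extension_loaded() is a PHP built-in.

```php
<?php
// Hypothetical helper: reports whether the zip extension is available.
function zip_extension_status() {
    // extension_loaded() returns true once PHP has loaded the named extension
    if (extension_loaded('zip')) {
        return 'zip extension is enabled';
    }
    return 'zip extension is missing, check the extension line in php.ini';
}

echo zip_extension_status(), PHP_EOL;
```

Remember to restart the web server after editing php.ini, otherwise the check keeps reporting the old state.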

Once the extension is enabled, you can use PHP's built-in zip functions to open and read the archive. Here is a small example that reads every file contained in a zip.

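function read_zip_file($zipfile) {
    $each_file_content = array();

    $zip = zip_open($zipfile);

    // zip_open() returns a resource on success and an error code on failure
    if (is_resource($zip)) {
        while (($zip_entry = zip_read($zip))) {
            if (!zip_entry_open($zip, $zip_entry)) continue;

            // Keyed by base name, so entries with the same name in different folders overwrite each other
            $filename = basename(zip_entry_name($zip_entry));
            $entry_content = zip_entry_read($zip_entry, zip_entry_filesize($zip_entry));

            $each_file_content[$filename] = $entry_content;

            zip_entry_close($zip_entry);
        }

        zip_close($zip);
    } else {
        echo 'Invalid Zip Format';
    }

    // Array of file name => file content (empty if the zip could not be opened)
    return $each_file_content;
}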
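
The zip_open() family of functions is deprecated as of PHP 8.0, so on newer installations the same job is usually done with the ZipArchive class from the same extension. Below is a rough sketch of that approach, not a drop-in replacement: the function name read_zip_file_with_ziparchive() is made up for illustration, and unlike the example above it keys the result by the full path inside the archive and returns false when the file cannot be opened.

```php
<?php
// Hypothetical ZipArchive-based counterpart of read_zip_file().
function read_zip_file_with_ziparchive($zipfile) {
    $contents = array();
    $zip = new ZipArchive();

    // open() returns true on success and an integer error code on failure
    if ($zip->open($zipfile) !== true) {
        return false;
    }

    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);

        // Directory entries end with a slash and carry no data
        if (substr($name, -1) === '/') {
            continue;
        }

        // Keyed by the full path inside the archive
        $contents[$name] = $zip->getFromIndex($i);
    }

    $zip->close();

    return $contents;
}
```

Calling read_zip_file_with_ziparchive('archive.zip') gives an array of entry path => content, where 'archive.zip' is just a placeholder name.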

How to read a .docx file in PHP?

A .docx file is a Microsoft Office Open XML document, the format introduced with Microsoft Office 2007. It is a ZIP archive of XML files, which keeps the file size down; the document text is stored in the word/document.xml entry. The function below takes the filename as a parameter and returns the text content of the .docx file.

function read_file_docx($filename){

    $striped_content = '';
    $content = '';

    if(!$filename || !file_exists($filename)) return false;

    $zip = zip_open($filename);

    if (!$zip || is_numeric($zip)) return false;

    while ($zip_entry = zip_read($zip)) {

        if (zip_entry_open($zip, $zip_entry) == FALSE) continue;

        if (zip_entry_name($zip_entry) != "word/document.xml") continue;

        $content .= zip_entry_read($zip_entry, zip_entry_filesize($zip_entry));

        zip_entry_close($zip_entry);
    }// end while

    zip_close($zip);

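    // A cell boundary inside a table row becomes a space
    $content = str_replace('</w:r></w:p></w:tc><w:tc>', " ", $content);
    // The end of a paragraph becomes a line break
    $content = str_replace('</w:r></w:p>', "\r\n", $content);
    // Drop all remaining XML tags, leaving only the text
    $striped_content = strip_tags($content);

    return $striped_content;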
}
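
Since only one entry of the archive is needed, it can also be fetched directly by name instead of looping over every entry. The sketch below assumes the ZipArchive class is available; the function name read_docx_with_ziparchive() is made up for illustration, and the tag handling is copied from the function above.

```php
<?php
// Hypothetical variant of read_file_docx() built on ZipArchive.
function read_docx_with_ziparchive($filename) {
    $zip = new ZipArchive();

    // open() returns true on success and an integer error code on failure
    if ($zip->open($filename) !== true) {
        return false;
    }

    // getFromName() returns the entry's contents, or false if the entry is missing
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();

    if ($xml === false) {
        return false;
    }

    // Same tag handling as in read_file_docx()
    $xml = str_replace('</w:r></w:p></w:tc><w:tc>', ' ', $xml);
    $xml = str_replace('</w:r></w:p>', "\r\n", $xml);

    return strip_tags($xml);
}
```

Like the original, this is a plain-text extraction only: formatting, images, headers and footers are ignored.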

How to read a .doc file in PHP?

A while back I came across a situation where I had to read .doc and .docx files in PHP. I could not use PHP's file functions, so I did a little googling and found a few functions that do the job, which I want to share here.

Since .doc and .docx files are not plain text, their text cannot be read directly in PHP using file_get_contents() or file().
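
To see why, look at the first bytes of each file: a .docx is a ZIP container and a binary .doc is an OLE2 compound file, and each starts with a fixed signature rather than readable text. Here is a small sketch that tells the two apart. The function name detect_word_format() is made up for illustration, and the check only identifies the container, so any ZIP file will be reported as 'docx'.

```php
<?php
// Hypothetical helper: guesses the Word format from the first bytes of a file.
function detect_word_format($filename) {
    $fh = @fopen($filename, 'rb');
    if ($fh === false) {
        return false;
    }

    $magic = (string) fread($fh, 8);
    fclose($fh);

    // ZIP archives (and therefore .docx files) start with "PK\x03\x04"
    if (strncmp($magic, "PK\x03\x04", 4) === 0) {
        return 'docx';
    }

    // OLE2 compound files (binary .doc) start with D0 CF 11 E0 A1 B1 1A E1
    if ($magic === "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1") {
        return 'doc';
    }

    return 'unknown';
}
```

A check like this can be used to decide which of the reader functions to call when the file extension cannot be trusted.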

.doc files use the Microsoft Word Binary File Format, which stores text formatting information alongside the text itself. The function below takes the filename as a parameter and returns the content of the .doc file.

function read_doc_file($filename) {
    if (!file_exists($filename)) return false;

    // Open in binary mode, since .doc is a binary format
    if (($fh = fopen($filename, 'rb')) === false) return false;

    // This function relies on a fixed layout: the text length is stored
    // at offsets 0x21C to 0x21F and the text itself starts at offset 0xA00
    $headers = fread($fh, 0xA00);

    if (strlen($headers) < 0x220) {
        fclose($fh);
        return false;
    }

    // Byte 0x21C, weight 1: document has from 0 to 255 characters
    $n1 = ( ord($headers[0x21C]) - 1 );

    // Byte 0x21D, weight 256: document has from 256 to 63743 characters
    $n2 = ( ( ord($headers[0x21D]) - 8 ) * 256 );

    // Byte 0x21E, weight 256*256: document has from 63744 to 16775423 characters
    $n3 = ( ( ord($headers[0x21E]) * 256 ) * 256 );

    // Byte 0x21F, weight 256*256*256: document has from 16775424 to 4294965504 characters
    $n4 = ( ( ( ord($headers[0x21F]) * 256 ) * 256 ) * 256 );

    // Total length of text in the document
    $textLength = ($n1 + $n2 + $n3 + $n4);

    // fread() requires a positive length
    if ($textLength <= 0) {
        fclose($fh);
        return '';
    }

    $extracted_plaintext = fread($fh, $textLength);
    fclose($fh);

    // nl2br() shows each paragraph on a new line in HTML output;
    // if you need more spacing after each paragraph, apply nl2br() again
    return nl2br($extracted_plaintext);
}