English 中文(简体)
Split a text file in PHP
原标题:

How can I split a large text file into separate files by character count using PHP? So a 10,000 character file split every 1000 characters would be split into 10 files. Further, can I split only after a full stop is found?

Thanks.

UPDATE 1: I like zombats code and I removed some errors and have come up with the following, but does anyone know how to only split after a full stop?

$i = 1;
    $fp = fopen("test.txt", "r");
    while(! feof($fp)) {
        $contents = fread($fp,1000);
        file_put_contents( new_file_ .$i. .txt , $contents);
        $i++;
    }

UPDATE 2: I took zombats suggestion and modified the code to that below and it seems to work -

$i = 1;
    $fp = fopen("test.txt", "r");
    while(! feof($fp)) {
        $contents = fread($fp,20000);
        $contents .= stream_get_line($fp,1000,".");
        $contents .=".";

        file_put_contents("Split/".$tname."/"."new_file_".$i.".txt", $contents);
        $i++;
    }
最佳回答

You should be able to accomplish this easily with a basic fread(). You can specify how many bytes you want to read, so it s trivial to read in an exact amount and output it to a new file.

Try something like this:

$i = 1;
$fp = fopen("test.txt", r );
while(! feof($fp)) {
    $contents = fread($fp,1000);
    file_put_contents( new_file_ .$i. .txt ,$contents);
    $i++;
}

EDIT

If you wish to stop after a certain amount of length OR on a certain character, then you could use stream_get_line() instead of fread(). It s almost identical, except it allows you to specify any ending delimiter you wish. Note that it does not return the delimeter as part of the read.

$contents = stream_get_line($fp,1000,".");
问题回答

There is a bug in run function; variable $split not defined.

The easiest way is to read the contents of the file, split the content, then save to two other files. If your files are more than a few gigabytes, you re going to have a problem doing it in PHP due to integer size limitations.

You could also write a class to do this for you.

<?php

/**
* filesplit class : Split big text files in multiple files
*
* @package
* @author Ben Yacoub Hatem <hatem@php.net>
* @copyright Copyright (c) 2004
* @version $Id$ - 29/05/2004 09:02:10 - filesplit.class.php
* @access public
**/
class filesplit{
    /**
     * Constructor
     * @access protected
     */
    function filesplit(){

    }

    /**
     * File to split
     * @access private
     * @var string
     **/
    var $_source =  logs.txt ;

    /**
     *
     * @access public
     * @return string
     **/
    function Getsource(){
        return $this->_source;
    }

    /**
     *
     * @access public
     * @return void
     **/
    function Setsource($newValue){
        $this->_source = $newValue;
    }

    /**
     * how much lines per file
     * @access private
     * @var integer
     **/
    var $_lines = 1000;

    /**
     *
     * @access public
     * @return integer
     **/
    function Getlines(){
        return $this->_lines;
    }

    /**
     *
     * @access public
     * @return void
     **/
    function Setlines($newValue){
        $this->_lines = $newValue;
    }

    /**
     * Folder to create splitted files with trail slash at end
     * @access private
     * @var string
     **/
    var $_path =  logs/ ;

    /**
     *
     * @access public
     * @return string
     **/
    function Getpath(){
        return $this->_path;
    }

    /**
     *
     * @access public
     * @return void
     **/
    function Setpath($newValue){
        $this->_path = $newValue;
    }

    /**
     * Configure the class
     * @access public
     * @return void
     **/
    function configure($source = "",$path = "",$lines = ""){
        if ($source != "") {
            $this->Setsource($source);
        }
        if ($path!="") {
            $this->Setpath($path);
        }
        if ($lines!="") {
            $this->Setlines($lines);
        }
    }


    /**
     *
     * @access public
     * @return void
     **/
    function run(){
        $i=0;
        $j=1;
        $date = date("m-d-y");
        unset($buffer);

        $handle = @fopen ($this->Getsource(), "r");
        while (!feof ($handle)) {
          $buffer .= @fgets($handle, 4096);
          $i++;
              if ($i >= $split) {
              $fname = $this->Getpath()."part.$date.$j.txt";
               if (!$fhandle = @fopen($fname,  w )) {
                    print "Cannot open file ($fname)";
                    exit;
               }

               if (!@fwrite($fhandle, $buffer)) {
                   print "Cannot write to file ($fname)";
                   exit;
               }
               fclose($fhandle);
               $j++;
               unset($buffer,$i);
                }
        }
        fclose ($handle);
    }


}
?>


Usage Example
<?php
/**
* Sample usage of the filesplit class
*
* @package filesplit
* @author Ben Yacoub Hatem <hatem@php.net>
* @copyright Copyright (c) 2004
* @version $Id$ - 29/05/2004 09:14:06 - usage.php
* @access public
**/

require_once("filesplit.class.php");

$s = new filesplit;

/*
$s->Setsource("logs.txt");
$s->Setpath("logs/");
$s->Setlines(100); //number of lines that each new file will have after the split.
*/

$s->configure("logs.txt", "logs/", 2000);
$s->run();
?>

Source http://www.weberdev.com/get_example-3894.html

I had fix class and work perfect with .txt file.

<?php

/**
* filesplit class : Split big text files in multiple files
*
* @package
* @author Ben Yacoub Hatem <hatem@php.net>
* @copyright Copyright (c) 2004
* @version $Id$ - 29/05/2004 09:02:10 - filesplit.class.php
* @access public
**/
class filesplit{
    /**
     * Constructor
     * @access protected
     */
    function filesplit(){

    }

    /**
     * File to split
     * @access private
     * @var string
     **/
    var $_source =  logs.txt ;

    /**
     *
     * @access public
     * @return string
     **/
    function Getsource(){
        return $this->_source;
    }

    /**
     *
     * @access public
     * @return void
     **/
    function Setsource($newValue){
        $this->_source = $newValue;
    }

    /**
     * how much lines per file
     * @access private
     * @var integer
     **/
    var $_lines = 1000;

    /**
     *
     * @access public
     * @return integer
     **/
    function Getlines(){
        return $this->_lines;
    }

    /**
     *
     * @access public
     * @return void
     **/
    function Setlines($newValue){
        $this->_lines = $newValue;
    }

    /**
     * Folder to create splitted files with trail slash at end
     * @access private
     * @var string
     **/
    var $_path =  logs/ ;

    /**
     *
     * @access public
     * @return string
     **/
    function Getpath(){
        return $this->_path;
    }

    /**
     *
     * @access public
     * @return void
     **/
    function Setpath($newValue){
        $this->_path = $newValue;
    }

    /**
     * Configure the class
     * @access public
     * @return void
     **/
    function configure($source = "",$path = "",$lines = ""){
        if ($source != "") {
            $this->Setsource($source);
        }
        if ($path!="") {
            $this->Setpath($path);
        }
        if ($lines!="") {
            $this->Setlines($lines);
        }
    }


    /**
     *
     * @access public
     * @return void
     **/
    function run(){

        $buffer =   ;
        $i=0;
        $j=1;
        $date = date("m-d-y");
        $handle = @fopen ($this->Getsource(), "r");

        while (!feof ($handle)) {

            $buffer .= @fgets($handle, 4096);
            $i++;

            if ($i >= $this->getLines()) { 

                // set your filename pattern here.
                $fname = $this->Getpath()."split_{$j}.txt";

                if (!$fhandle = @fopen($fname,  w )) {
                    print "Cannot open file ($fname)";
                    exit;
                }

                if (!@fwrite($fhandle, $buffer)) {
                    print "Cannot write to file ($fname)";
                    exit;
                }
                fclose($fhandle);
                $j++;
                unset($buffer,$i);
            }
        } 
        if ( !empty($buffer) && !empty($i) ) {
            $fname = $this->Getpath()."split_{$j}.txt";

            if (!$fhandle = @fopen($fname,  w )) {
                print "Cannot open file ($fname)";
                exit;
            }

            if (!@fwrite($fhandle, $buffer)) {
                print "Cannot write to file ($fname)";
                exit;
            }
                fclose($fhandle);
            unset($buffer,$i);  
        }
        fclose ($handle);
    }


}
?>

Usage Example

<?php

require_once("filesplit.class.php");

$s = new filesplit;
$s->Setsource("logs.txt");
$s->Setpath("logs/");
$s->Setlines(100); //number of lines that each new file will have after the split.
//$s->configure("logs.txt", "logs/", 2000);
$s->run();
?>

In the run function, I made the following adjustments, to fix the "split is not defined" warning.

function run(){

        $buffer=  ;
        $i=0;
        $j=1;
        $date = date("m-d-y");
        $handle = @fopen ($this->Getsource(), "r");

        while (!feof ($handle)) {

            $buffer .= @fgets($handle, 4096);
            $i++;

            if ($i >= $this->getLines()) { // $split was here, empty value..

                // set your filename pattern here.
                $fname = $this->Getpath()."dma_map_$date.$j.csv";

                if (!$fhandle = @fopen($fname,  w )) {
                    print "Cannot open file ($fname)";
                    exit;
                }

                if (!@fwrite($fhandle, $buffer)) {
                    print "Cannot write to file ($fname)";
                    exit;
                }
                fclose($fhandle);
                $j++;
                unset($buffer,$i);
            }
        }
        fclose ($handle);
    }

I used this class to split a 500,000 line CSV file into 10 files, so phpmyadmin can consume it without timeout issues. Worked like a charm.





相关问题
Simple JAVA: Password Verifier problem

I have a simple problem that says: A password for xyz corporation is supposed to be 6 characters long and made up of a combination of letters and digits. Write a program fragment to read in a string ...

Case insensitive comparison of strings in shell script

The == operator is used to compare two strings in shell script. However, I want to compare two strings ignoring case, how can it be done? Is there any standard command for this?

Trying to split by two delimiters and it doesn t work - C

I wrote below code to readin line by line from stdin ex. city=Boston;city=New York;city=Chicago and then split each line by ; delimiter and print each record. Then in yet another loop I try to ...

String initialization with pair of iterators

I m trying to initialize string with iterators and something like this works: ifstream fin("tmp.txt"); istream_iterator<char> in_i(fin), eos; //here eos is 1 over the end string s(in_i, ...

break a string in parts

I have a string "pc1|pc2|pc3|" I want to get each word on different line like: pc1 pc2 pc3 I need to do this in C#... any suggestions??

Quick padding of a string in Delphi

I was trying to speed up a certain routine in an application, and my profiler, AQTime, identified one method in particular as a bottleneck. The method has been with us for years, and is part of a "...

热门标签