Your mission should you choose to accept it...



#1 ntheory

Posted 11 May 2003 - 12:19 PM

Open a 2+ GB text file. Any suggestions? I know it's ridiculous, but I seem to be bumping into this problem a lot (not usually quite this bad) where editors just refuse to open large files. biew (Binary vIEW - check it out if you need a great hex editor/disassembler/etc.) used to do a great job on everything, but it won't even read this one.

I could break the file up into pieces, but since this is going to happen a lot I'd like a way to do it all in one shot. All the editor needs to do is open the file and let me search for a case-sensitive string (grep works, but I lose a lot of freedom in moving around the file).
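Just so it's concrete, this is roughly what I fall back on when an editor chokes (a sketch only - the file name and search string here are made up for the example):

    # page through the file without loading it all into memory
    # (less can hit the same 2 GB wall if it wasn't built with large-file support)
    less huge.log

    # case-sensitive search with line numbers and a couple of lines of context
    grep -n -C 2 'Error 1062' huge.log

    # or chop it into 5-million-line pieces that an editor can actually open
    split -l 5000000 huge.log huge.part.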

#2 White_Raven

Posted 11 May 2003 - 11:00 PM

If you're under a *nix prompt this shouldn't be a problem... but if I were you I would check out Programmer's File Editor at http://www.lancs.ac....ople/cpaap/pfe/

The thing is, you should be able to open a large file in vi or nano under Linux... are you under Windows? If so, that link may help.

#3 ntheory

Posted 12 May 2003 - 12:04 AM

I'm running Debian Linux. vi, bvi, xemacs, emacs, and biew won't even touch it. Usually vi will try to load it until it runs out of memory (it always loads the whole file into memory) but this time it just died immediately.

I'll try out pfe. Thanks.

Edit:

Er, no I won't. I glossed over the fact that it's for Windows. Back to the drawing board. :-/

#4 White_Raven

Posted 12 May 2003 - 05:01 AM

Well, like I said, it's a Windows prog... have you tried googling for "linux file editor" or "large file support"?

#5 ntheory

Posted 12 May 2003 - 10:03 AM

I tried all of that a while back. biew was the only one that could handle big files, but I think it can't go beyond 2 GB no matter how much RAM you have.

Oh well.

#6 White_Raven

Posted 12 May 2003 - 06:42 PM

Hmm... maybe it's a limitation of the stdio libs on Linux? If so, updating them would be the best option. I really don't know that much about large file support, although right now I need something to help with backups of files several gigs in size...
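For what it's worth, on glibc the 2 GB wall is usually a compile-time thing rather than a library you can swap out - something like this (just a sketch; mytool.c is a stand-in for whatever program needs rebuilding):

    # ask glibc what flags a program needs in order to handle >2 GB files
    getconf LFS_CFLAGS
    # typically prints: -D_FILE_OFFSET_BITS=64

    # rebuild the tool with those flags so it uses 64-bit file offsets
    gcc $(getconf LFS_CFLAGS) -o mytool mytool.c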

#7 LeeBoy

Posted 12 May 2003 - 07:42 PM

Just wondering, what kind of text file is 2+ GB???

:P

#8 StankDawg

Posted 12 May 2003 - 09:05 PM

server log files, I would imagine.

Databases can tend to be very large in some circumstances as well.

My, that is a big font you have! :P

#9 ntheory

Posted 12 May 2003 - 09:27 PM

Quoting StankDawg:

    server log files, I would imagine.

    Databases can tend to be very large in some circumstances as well.

    My, that is a big font you have! :P

Stank is on the money there. I was creating another database to tie into my map database. It's basically a copy of part of the FCC licensing database.

Since the FCC formats their data so badly (delimited text that uses the pipe character as the separator, with CARRIAGE RETURNS IN THE MIDDLE OF RECORDS) I had to write a Perl script to fix it. The Perl script can't fix all of their mistakes, so while I was doing the DB inserts I logged everything to a file.
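The cleanup is roughly this sort of thing (a sketch, not my actual Perl - the file names are made up, and I'm assuming a complete record has exactly 10 pipes and that pipes never appear inside a field):

    awk '
    {
      rec = (rec == "" ? $0 : rec " " $0)   # glue continuation lines back onto the current record
      n = gsub(/\|/, "|", rec)              # count the pipes accumulated so far
      if (n >= 10) { print rec; rec = "" }  # 10 pipes = one complete record (assumed), write it out
    }' uls_dump.dat > uls_fixed.dat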

After all is said and done I have 1.8+ million successful insertions and about 300 failures. I was trying to see if there was an easy way to browse the file and catch a pattern in the errors. Doing it with grep is possible, but it isn't very efficient when there are lots of failures. Using grep I determined that the failures were on records I'm not that upset about missing anyway.
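The grep pass itself was nothing fancy - roughly this (again a sketch; the log name and the "INSERT FAILED" marker are made up for the example):

    # pull the failed inserts out of the load log and count them
    grep 'INSERT FAILED' insert.log > failures.txt
    wc -l failures.txt

    # crude pattern hunt: tally the distinct error messages
    cut -d':' -f2- failures.txt | sort | uniq -c | sort -rn | head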

I'll show you guys the project when it's finished. It's nothing special, just fun.

#10 White_Raven

Posted 12 May 2003 - 09:37 PM

The files I'm trying to back up are tar images of people's /home directories - I have found that after tarring the entire directory it will not let itself be compressed, and I can't do much after that... if you know of a solution by all means please let me know!

#11 ntheory

Posted 13 May 2003 - 07:52 AM

Hmm... I haven't run across that before. Have you tried doing the compression "in-line" by doing "tar cjvf archive.tar.bz2 directory" or "tar czvf archive.tar.gz directory"?

Yeah, I don't have any experience with that so those are only guesses.
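And if in-line compression still leaves you with one file that's too big to handle, another thing to try is splitting the compressed stream into chunks as it's written (a sketch - the directory name and chunk size are made up):

    # compress while tarring and cut the stream into 650 MB pieces
    tar czf - /home/someuser | split -b 650m - home-backup.tar.gz.

    # to restore, stitch the pieces back together and untar
    cat home-backup.tar.gz.* | tar xzf -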

#12 LeeBoy

Posted 13 May 2003 - 06:52 PM

Thanks for clearing that up.

One more question: when the logs are that big, do you use a program to generate a report, or do you read through them?

#13 ntheory

Posted 13 May 2003 - 07:33 PM

I'm pretty low-tech when it comes to my databases. I don't have any fancy analysis tools. I write a bunch of Perl scripts, insert the data, and do some basic checks to make sure that it worked.

Usually if the log file is small enough I just open it in vi and scan it for errors (with regular expressions).

So far it seems that my system works pretty well but I haven't tested this latest database yet.

#14 LeeBoy

Posted 13 May 2003 - 09:27 PM

Wow, that's great.

When I get enough time to play with a server I may look you up for some help.

#15 White_Raven

Posted 14 May 2003 - 12:37 AM

grep is a man's best friend.

#16 Zapperlink

Posted 14 May 2003 - 01:05 AM

lol a real man would break out his dot matrix printer and print it :)

#17 White_Raven

Posted 14 May 2003 - 05:05 AM

Dot matrix? Yeah right, try teletype tape :P :ninja:

#18 ntheory

Posted 14 May 2003 - 07:13 AM

Damn, a dot matrix printer? That's too hardcore for me. I'd probably also have to eliminate most of the forests on Earth to print this thing. :P

You can see the beta version with no FCC data plotted yet if you like. It's a mapping system I built from the ground up last year. So far it just draws some basic map features but it has all of the data in it to do street names and tons of other things. I just haven't finished it because I was distracted by other projects. Every once in a while I go back and mess with it some more.

My goal with this latest update is to plot the locations of cell sites, etc. using the FCC ULS database. It'll be easy once I get off my ass and do it. :P

p.s. If you go there and backtrack into the webcam directory don't bother trying to move the camera. I haven't hooked it back up in a long time. :( Also, if you request a couple of maps that are really large you may end up waiting a few minutes for them to be generated. Some of these queries return hundreds of thousands of rows (if not millions).

#19 White_Raven

Posted 15 May 2003 - 04:46 AM

Now that mapping system is cool... is it GPL?



#20 ntheory

Posted 15 May 2003 - 08:27 AM

Thanks. :)

Not yet. But it will be when I've added all the cool toys to it. Unfortunately I foresee a small market since it'll probably take about 40GB once it's finished. <_<



