
vrf *** Splitting a file using VEE ***

Question asked by rsb on Feb 12, 2004
"Shawn Fessenden" <shawn@testech-ltd.com> wrote:
> > Has anyone tried to split a file into user
> > defined sizes using VEE?
>
> Yup. Read the entire thing at once (very fast). Use a UInt8 array for the
> destination. Then write out the chunk files directly from the array without
> moving or re-reading anything. Like:
>
> Read myFile size 100 bytes, write chunks of 20 bytes.
>

Given the discussion concerning "split", I was curious, so I did
a little experimentation and have some interesting results to share.

Problem: split a 600MB file into two 300MB files given limited RAM

Solution: read the data in chunks and write in chunks using a loop.
          Essentially what Shawn suggests but not reading in the whole
          thing at once. Instead use a loop that reads a chunk,
          then writes it out and repeats until done.
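
For readers who want to see the loop in text form, here is a minimal sketch
in Python (VEE itself is graphical, so this is not the VEE program used for
the measurements). The function name split_file, the ".partNNN" naming
scheme, and the 50,000-byte default chunk size are all assumptions made for
illustration.

```python
def split_file(src_path, part_size, chunk_size=50_000):
    """Split src_path into parts of at most part_size bytes.

    Only chunk_size bytes are held in memory at a time.
    Returns the list of part file paths (hypothetical naming scheme).
    """
    part_paths = []
    with open(src_path, "rb") as src:
        index = 0
        while True:
            remaining = part_size
            # Read the first chunk of the next part; empty means end of file.
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:
                break
            part_path = f"{src_path}.part{index:03d}"
            with open(part_path, "wb") as dst:
                while chunk:
                    dst.write(chunk)
                    remaining -= len(chunk)
                    if remaining == 0:
                        break  # this part is full
                    chunk = src.read(min(chunk_size, remaining))
            part_paths.append(part_path)
            index += 1
    return part_paths
```

For the problem above, a call such as split_file("big.dat", 300_000_000)
(file name hypothetical) would produce two parts while never holding more
than one chunk in RAM.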

Anyhow, here's some data showing total time taken vs. chunk size.
My interpretation is that with smaller chunks we are seeing the
overhead of an interpreted language, whereas with larger chunks
we are seeing effects such as cache sizes, memory access
speeds, bus rates and so on. On my system at least, the "sweet spot"
appears to be around 50kB chunks, with a relatively small
penalty for going bigger but a very large penalty for going smaller.
This would likely vary between systems.

chunk size    total time
(bytes)       (sec)

1000          833
2154          460
4642          221
10k            98
21.54k         47
46.42k         29
100k           32
215.4k         41
464.2k         58
1M             63
2.154M         68
4.642M         74
10M            70
21.54M         71
46.42M         76
100M           68

If anyone is curious it might be interesting to see similar data on
other systems.
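
As a starting point for gathering comparable numbers elsewhere, here is a
rough Python sketch of such a measurement. It is not the VEE program used
above; the function names, the three-steps-per-decade spacing (inferred
from the chunk sizes in the table), and the use of a plain chunked copy as
the timed operation are assumptions. Results will also depend on OS file
caching, so repeated runs may differ.

```python
import time

def log_spaced_sizes(start=1000, decades=5, per_decade=3):
    """Chunk sizes spaced evenly on a log scale, e.g. 1000, 2154, 4642, 10000, ..."""
    steps = decades * per_decade
    return [round(start * 10 ** (i / per_decade)) for i in range(steps + 1)]

def time_copy(src_path, dst_path, chunk_size):
    """Copy src_path to dst_path chunk_size bytes at a time; return elapsed seconds."""
    begin = time.perf_counter()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # end of file
            dst.write(chunk)
    return time.perf_counter() - begin
```

Looping over log_spaced_sizes() and calling time_copy() on the same large
file for each size would give a table in the same shape as the one above.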

If this trend holds, a "split" of large files may indeed be very
practical in VEE. I don't have data for compiled code, but I suspect
it wouldn't be very much faster than 30 sec.

regards

Stan


--------------------------------------------------------------------------
Stan Bischof  Agilent Technologies  707-577-3994  stan_bischof@agilent.com
--------------------------------------------------------------------------
