Discussion:
Different topic
AlJones
2011-05-05 20:32:22 UTC
(( and yes, I know I'm in the wrong group, but hope some of you might have
tried to do the same thing. ))

I'm now trying to read the header of a .LIT (MS eReader) to pull out the
author / title. MS's response has been that the structure is locked by
design. I've looked at the source for cvtlit but he seems to be interested
in only that part of the file which he can convert to text.

Does anyone have, or know where I can find, a way of retrieving
information from an eReader file?

//al
GS
2011-05-06 05:22:30 UTC
I'm thinking the info you want to extract is meta data that might be
found in File Properties. Google DsoFile.DLL and see what it can do for
you.
--
Garry

Free usenet access at http://www.eternal-september.org
ClassicVB Users Regroup! comp.lang.basic.visual.misc
AlJones
2011-05-06 16:45:35 UTC
Post by GS
I'm thinking the info you want to extract is meta data that might be
found in File Properties. Google DsoFile.DLL and see what it can do for
you.
Hadn't thought of that. I already use the DsoFile DLL to read the title &
author from DOC and RTF files, so dropping the LIT in to see would be easy.
Thanks for the suggestion and I'll report back on what I find.
(( I have one other bug that is VERY time consuming when the program is
run, and I have to straighten that out first, so it'll probably be tomorrow
before I get back here. ))
Thanks for the suggestion //al

(( and of course it was my oversight when I did not say I was trying to
retrieve that information programmatically, but in a programming group I
*really* didn't think I'd have to. ))
Dee Earley
2011-05-09 10:45:28 UTC
Post by AlJones
and of course it was my oversight when I did not say I was trying to
retrieve that information programmatically, but in a programming group I
*really* didn't think I'd have to.
You should still give explicit subjects, offtopic or not :)
--
Dee Earley (***@icode.co.uk)
i-Catcher Development Team
http://www.icode.co.uk/icatcher/

iCode Systems

(Replies direct to my email address will be ignored.
Please reply to the group.)
Al Jones
2011-05-09 16:07:24 UTC
Post by Dee Earley
Post by AlJones
and of course it was my oversight when I did not say I was trying to
retrieve that information programmatically, but in a programming group I
*really* didn't think I'd have to.
You should still give explicit subjects, offtopic or not :)
Criticism acknowledged and accepted; and since you've joined the
discussion, may I ask if you can add anything significant to it??
//al
Dee Earley
2011-05-09 17:38:34 UTC
Post by Al Jones
Post by Dee Earley
Post by AlJones
and of course it was my oversight when I did not say I was trying to
retrieve that information programmatically, but in a programming group I
*really* didn't think I'd have to.
You should still give explicit subjects, offtopic or not :)
Criticism acknowledged and accepted; and since you've joined the
discussion, may I ask if you can add anything significant to it??
No, I read it to see what it was about, but as it's off topic and
something I don't know anything about, I can't help further.

Sorry
Mayayana
2011-05-06 13:18:01 UTC
I had no idea that eReaders even had formats, but I
see here....

http://en.wikipedia.org/wiki/Comparison_of_e-book_formats#Microsoft_LIT

...that there are dozens of them. And I had no idea that
one doesn't even need to buy an eReader gadget to read
or stream restricted and/or spyware books. One can do it
on a PC. Such a relief...

At that page it says that .LIT is similar to .CHM, and that some
software can convert .LIT to HTML files. CHM files are actually
just a website in a compressed file. I don't know offhand what the
compression is, but the free 7-ZIP can open them. I downloaded
a LIT file and find that, indeed, 7-ZIP can open it. There are various
files inside, such as a manifest, TOC, JPG files, etc. But according to
7-ZIP, the DRM and content files are treated with an unknown
"EBEncrypt" method. When I extracted the LIT contents I found
that the two files 7-ZIP shows as encrypted have no content at all,
even though 7-ZIP shows them with content. I don't know whether
7-ZIP is just creating an empty file that it can't write to, or whether
Microsoft is somehow using calculated file corruption to make it
extract that way. I guess the former possibility is more likely.

So 7-ZIP seems to be able to give you everything *except*
the text. The program you mentioned seems to be Convert Lit,
not "cvtlit". That program is open source, so you should be
able to work out all the details if you want to. Convert Lit must
include both extraction and decrypting.
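
A quick way to see which container family a file belongs to, without 7-ZIP, is to compare its first bytes against known signatures. The sketch below assumes the commonly documented magic values: "ITSF" for CHM, "ITOLITLS" for LIT, the OLE compound-file signature, and the ZIP local-file header. The OLE entry is included because DsoFile reads OLE structured-storage properties, so a LIT that does not carry that signature may not respond to DsoFile. The function name and labels are made up for illustration.

```python
# Hypothetical helper: identify a container type from its leading bytes.
# Signature values are the commonly documented ones; verify against real files.
SIGNATURES = [
    (b"ITOLITLS", "ITOL/ITLS container (e.g. LIT)"),
    (b"ITSF", "ITSF container (CHM)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE compound file (e.g. DOC)"),
    (b"PK\x03\x04", "ZIP archive (e.g. DOCX)"),
]


def sniff_container(header: bytes) -> str:
    """Return a label for the first matching signature, or 'unknown'."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first 16 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(16))
```

Calling sniff_file on a LIT and on a CHM should show whether the two really share a container format or only look alike in 7-ZIP.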


AlJones
2011-05-06 16:41:19 UTC
I don't normally top post but to follow your lead, I will.
I'm writing a program to provide a catalogue of eBooks (and move/copy them
to subfolders by author/series) but in another variety of basic (ahem!).
I appreciate your effort, and will add the note that 7-zip knows, somewhat,
how to handle a lit file (an interesting piece of information).
The structure is "similar" to the chm, but with enough differences that one
can't follow the pointers through to get where I want to go.
And actually in his source, he refers to it as clit and in my mind the
MSDOS program is cvtlit and the 3rd Party GUI is Convertlit - ah well,
that's just the way my mind hangs things together, sorry if there was a
conversion factor there.
Thanks for your comments //al
GS
2011-05-06 23:55:31 UTC
Just to add to your info, Mayayana...

I have a utility with my ebook authoring software that lets me convert
CHMs to an ebook.exe. Do you think a LIT is similar enough that by
renaming its extension it will also convert? (Ebook is just a package
of web pages and a HTML reader)
AlJones
2011-05-07 00:23:55 UTC
Post by GS
Just to add to your info, Mayayana...
I have a utility with my ebook authoring software that lets me convert
CHMs to an ebook.exe. Do you think a LIT is similar enough that by
renaming its extension it will also convert? (Ebook is just a package
of web pages and a HTML reader)
The way that you've addressed this I believe it to be to Mayayana but just
in case it's not - it would serve my purposes not at all. And I'm still
trying to decide whether I can't count or whether I 'lost' a couple of
hundred files! Damn programming's a lot of fun!
GS
2011-05-07 00:42:03 UTC
Post by AlJones
The way that you've addressed this I believe it to be to Mayayana but just
in case it's not - it would serve my purposes not at all. And I'm still
trying to decide whether I can't count or whether I 'lost' a couple of
hundred files! Damn programming's a lot of fun!
I was addressing Mayayana, not offering a suggestion for your issue. I
guess it qualifies as OT and so I'm sorry if it has distracted you.
AlJones
2011-05-07 16:27:33 UTC
Post by GS
I was addressing Mayayana, not offering a suggestion for your issue. I
guess it qualifies as OT and so I'm sorry if it has distracted you.
No distraction and I won't consider it off-topic. I just hate to accidentally
offend people.
//al
Mayayana
2011-05-07 03:45:57 UTC
| I have a utility with my ebook authoring software that lets me convert
| CHMs to an ebook.exe. Do you think a LIT is similar enough that by
| renaming its extension it will also convert? (Ebook is just a package
| of web pages and a HTML reader)
|
I don't think there's much in common. They're the
same in that they're both basically ZIP files, like a
.docx. But if you open a LIT and a CHM in 7-ZIP you'll
see that the file names are all different. A CHM contains
webpages and files in one or more folders, plus specific
files that serve as TOC, index, etc.

In a LIT I find a file called "manifest" that lists included
JPG files and their file type; a file called "meta" that seems
to be meta-data: file name, date, a few GUIDs thrown in to
make it look official, etc. Then there's a folder named
"data" that contains all the JPGs. A subfolder there
contains a file named "ahc" that seems to be the TOC,
and a file named "content" that seems to be the actual
text, in encrypted form....

So it's all different. It's as though they got the idea from
CHMs, but beyond that there's nothing in common. (We were
talking about HXS last week. HXS is almost exactly the same
thing as CHM. They just changed some file syntax here and
there. But that's enough to break compatibility so that an
HXS can't be read as a CHM.)

It looks like a LIT can be retrieved fairly easily if the
decryption of the content file can be done. But AlJones doesn't
want that. I'm guessing what he wants is just the "meta" file.
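
If the "meta" file has already been extracted with 7-ZIP, one low-effort way to look for a title or author without knowing its layout is to list the printable text runs in it, much like the Unix strings tool. This is only an exploratory sketch: it assumes the values are stored as readable ASCII or UTF-16LE text, which nothing in this thread confirms, and the function name is made up for illustration.

```python
import re


def printable_runs(blob: bytes, min_len: int = 4) -> list[str]:
    """Return printable ASCII and UTF-16LE text runs, in file order."""
    # Runs of plain printable ASCII bytes.
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable ASCII characters stored as 2-byte UTF-16LE units.
    utf16_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [(m.start(), m.group().decode("ascii"))
             for m in ascii_pat.finditer(blob)]
    found += [(m.start(), m.group().decode("utf-16-le"))
              for m in utf16_pat.finditer(blob)]
    # Sort by offset so the output follows the order in the file.
    return [text for _, text in sorted(found)]
```

For an extracted file this would be called as printable_runs(open("meta", "rb").read()), then scanned by eye for anything that looks like a title or author.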

I wonder if anyone who's willing to put up with DRM would
even care about being able to decrypt it. If they cared they
wouldn't pay for crippled product in the first place. (Most eBooks
are not even digital files. They're nothing more than streaming
access. A LIT file is a file, but then, if Microsoft can control
how you read it, perhaps even tracking your usage, then the
line between streaming and owning a file blurs.)

I wonder if it's even legal under the DRM laws in the US to
take apart something like a LIT file. As I understand it, the Digital
Millennium Copyright Act provides criminal penalties for anyone
who exercises their Fair Use rights on any media they buy,
if the seller has done anything (like encryption) to thwart that
Fair Use. By law you have a right to do anything with copyrighted
material that you like: copy it for your own use, lend it, use it in
any manner. The only thing you can't do is to distribute copies to
others. But by law corporations selling eBooks, music, etc. also
have a right to break the law by locking the product they sell to
you. So you have to break the law, reverse-engineering illegal
restrictions, in order to exercise your rights. (I don't know if
that's Kafka-esque or Orwellian. Either way, the only reasonable
and decent course of action, to my mind, is to simply refuse to
use or buy any DRM-tainted media.)
GS
2011-05-07 07:03:07 UTC
Thanks, Mayayana. I don't know what DRM means. My ebooks are html
packaged in a lockable EXE. I'm looking into also doing same with PDFs.
I currently use ebook.EXEs as user guides in place of CHMs. I'd like to
have the option to do the same with PDFs. I have authoring software for
that but it's not lockable. I know that Adobe Digital Editions are
lockable, but I don't want to pay Adobe's price for the authoring
software.

Thanks for taking the time to provide so much detail. I really
appreciate you doing that!

I expect something like DsoFile.DLL is what AlJones is after since the
files he's looking at are compound docs by structure.
Helmut_Meukel
2011-05-07 09:24:50 UTC
Post by GS
I don't know what DRM means.
Digital Rights Management

Helmut.
Mayayana
2011-05-07 14:40:52 UTC
| Thanks, Mayayana. I don't know what DRM means.

As Helmut said, it can mean digital rights management.
About as many people translate it as digital restriction
management, since that's what it is. "Digital rights
management" is misleading marketing. DRM doesn't grant
rights or manage rights. It restricts access. I guess it's a
lot like saying pro-life and pro-choice instead of the more
accurate anti-abortion and pro-abortion. People on both
sides of that issue will argue all day about why it's incorrect
and biased to call them what they are. The propagandist
euphemisms make both sides sound more moral and noble. :)

| My ebooks are html
| packaged in a lockable EXE. I'm looking into also doing same with PDFs.

A lockable EXE? You mean like password protected? The LIT
file I downloaded is just a compressed file. The restriction part is
the encryption.

You're doing that with software help manuals? I'm curious: Why do
they need to be locked? And why doesn't a CHM work? And...if
it's a restricted eBook, how do people read it? Do you ship your own
custom reader with your software? Then people get a password to
unlock the help...something like that?

PDFs are an entirely different approach. The format is open
and available, but it's extremely complex. I once delved into
writing a PDF parser and gave up. It was just too much work.
PDFs themselves are not lockable. The restriction is just a flag
saved in the file. It only functions if the program that opens
the PDF checks the flag and acts accordingly. Most PDF software
does follow the flag's directive, but it doesn't have to. It's actually
an arbitrary crippling that could just as easily be done with any
file format. I suppose the only reason that PDF readers build in
"cripple-respect" is because Adobe started that tradition
by positioning PDF as an opaque digital object to transfer corporate
letterheads and magazine excerpts from the source to a remote
printer. In other words, PDFs (and to some extent DOCs) provided
the illusion of paper-like palpability to a business world accustomed
to being awash in official documents printed on paper. Yet the
PDF file itself actually has no restrictions. It's immutability based on
mutually agreed illusion. :)
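
To make the "flag" concrete: assuming it refers to the /P permissions entry of a PDF's encryption dictionary, that entry is a signed 32-bit integer read as a bitfield. The sketch below decodes such a value using the bit numbers given in the PDF reference (counted from 1 at the low-order bit). It only interprets a number you have already read out of a file; it does not parse PDFs, and the function name is made up for illustration.

```python
# Bit numbers are 1-based from the low-order bit, as in the PDF reference.
PERMISSION_BITS = {
    3: "print",
    4: "modify contents",
    5: "copy or extract text and graphics",
    6: "add or modify annotations",
    9: "fill in form fields",
    10: "extract for accessibility",
    11: "assemble document",
    12: "print at high quality",
}


def decode_permissions(p: int) -> dict[str, bool]:
    """Map a /P value to {permission name: allowed?}."""
    # /P is stored as a signed integer; view it as 32 unsigned bits.
    value = p & 0xFFFFFFFF
    return {
        name: bool(value & (1 << (bit - 1)))
        for bit, name in PERMISSION_BITS.items()
    }
```

A reader that chooses to honor the restrictions would consult a table like this before enabling Print or Copy; a reader that ignores it simply never looks.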

I've recompiled XPDF and SumatraPDF myself, to ignore the
restrictions flag. It only requires editing the code to not check the
flag. Both XPDF and Sumatra are OSS. I haven't made my code public
only because I wanted to respect the intentions of the OSS authors.

The problem here, to my mind, is that
a lot of people put restrictions on their PDFs for no good reason.
If I download a PDF and want to read it, I usually want to translate
it to plain text because PDF format is a pain in the neck. It's designed
for printing, not for reading. On the other hand, if I print it I usually
want to print it as plain text, too, so as not to waste ink on pointless
page decoration. In most cases there's no reason for limitations in that
regard.

Examples:
1) A few years ago I was in a car accident and downloaded the report
form. It was a PDF. I had to fill out 3 copies. But the State of Mass.
had locked the file so that I couldn't edit it! I ended up doing 2
screen shots and filling in the form as a BMP in Paint Shop Pro. Then
I printed the BMPs. I'm guessing that someone in state gov't just
locked the PDF without thinking, with some vague notion that official
documents are too official to leave unlocked.

2) I have a blind friend who uses text-to-speech software. He sometimes
gets PDFs that are actually made up of scanned pages, saved in the
PDF as BMPs. He can use OCR software on the BMPs, but only if
he can take them out of the PDF. If the material in question is free
to the public then there's no reason for the BMPs to be locked in.

|
| I expect something like DsoFile.DLL is what AlJones is after since the
| files he's looking at are compound docs by structure.
|
Yes, I guess that if that works it will provide an easy way to get the
"meta" file content. If it doesn't work it could be accessed directly.
Unfortunately, it looks like either method requires a dependency. DsoFile
is not a Windows system file. It would have to be added to one's software
install. Though it might be feasible to write VB code to do the same
thing. According to MS, DsoFile is just a wrapper for the IPropertyStorage
interface. There may be VB code for that around somewhere, like
at Eduardo Morcillo's website.

I just looked at a LIT file to see what the compression method is.
7-ZIP calls it a CHM. Interesting. It also says an HXS is a CHM. So I guess
CHM format is a sort of class of file. Other than 7-ZIP I have no
idea what can open a CHM. Looking at it in a hex editor, both CHM and
LIT seem to consist of a large header that lists the files inside and the
compression type. Then at a further offset is the actual compressed data.
I'm guessing it's probably similar to CAB files, which typically use either
MSZIP or LZX compression. So parsing the file from VB would be a big
project. If it were me I'd look into IPropertyStorage, in hopes of getting
a compact, dependency-free and dependable method of accessing any
"metadata" that Microsoft has written.
GS
2011-05-07 17:06:36 UTC
Thanks again, Mayayana!
Post by Mayayana
As Helmut said, it can mean digital rights management.
About as many people translate it as digital restriction
management, since that's what it is. "Digital rights
management" is misleading marketing. DRM doesn't grant
rights or manage rights. It restricts access. I guess it's a
lot like saying pro-life and pro-choice instead of the more
accurate anti-abortion and pro-abortion. People on both
sides of that issue will argue all day about why it's incorrect
and biased to call them what they are. The propagandist
euphemisms make both sides sound more moral and noble. :)
Post by GS
My ebooks are html
packaged in a lockable EXE. I'm looking into also doing same with PDFs.
A lockable EXE? You mean like password protected?
Yes, if the ebook is for sale. Readers require an unlock key, which can
be a password or a software-type serial key. The ebook is lockable to
the machine it's installed on OR a USB device it's installed on. The
machine/device ID is captured at first startup. The reader then clicks
a link in the viewer to where they 'register' the ebook. Here, the
machine/device ID is collected and emailed to me along with the
registration info. I use the machine/device ID to generate an unlock
key that will work only for that machine/device. IOW, it's pretty much
same as licensing software.
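
For readers unfamiliar with this kind of licensing, here is a minimal sketch of a machine-bound unlock key. It is not GS's actual scheme, which the post does not describe; it simply uses an HMAC over the machine/device ID with a seller-held secret. The names make_unlock_key and is_valid_key, the secret and the ID format are all hypothetical.

```python
import hashlib
import hmac


def make_unlock_key(machine_id: str, secret: bytes, length: int = 16) -> str:
    """Derive a key that is only valid for one machine/device ID."""
    digest = hmac.new(secret, machine_id.encode("utf-8"),
                      hashlib.sha256).hexdigest()
    # Shorten the hex digest so the key is practical to type in.
    return digest[:length].upper()


def is_valid_key(machine_id: str, key: str, secret: bytes) -> bool:
    """Check a key entered by the reader against the local machine ID."""
    expected = make_unlock_key(machine_id, secret, length=16)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, key.strip().upper())
```

One design caveat: if the check runs inside the shipped reader, the secret ships with it and can be extracted, so a scheme like this deters casual copying rather than a determined attacker.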
Post by Mayayana
The LIT
file I downloaded is just a compressed file. The restriction part is
the encryption.
You're doing that with software help manuals? I'm curious: Why do
they need to be locked?
In this usage they aren't locked. I also manipulate them so as to mimic
the in-process behavior of a CHM.
Post by Mayayana
And why doesn't a CHM work?
CHMs work fine, and I do use them for some projects. CHMs, however,
don't support scripting and so I use the ebook.exe format if the
content warrants scripting.
Post by Mayayana
And...if
it's a restricted eBook, how do people read it? Do you ship your own
custom reader with your software? Then people get a password to
unlock the help...something like that?
It can include its own built-in viewer, or use Windows' IE engine.
Readers are emailed an unlock key to activate the ebook. This only
applies to ebooks for sale, and so only after I receive confirmation of
payment do I release the activation key.
Post by Mayayana
PDFs are an entirely different approach. The format is open
and available, but it's extremely complex. I once delved into
writing a PDF parser and gave up. It was just too much work.
PDFs themselves are not lockable. The restriction is just a flag
saved in the file. It only functions if the program that opens
the PDF checks the flag and acts accordingly. Most PDF software
does follow the flag's directive, but it doesn't have to. It's actually
an arbitrary crippling that could just as easily be done with any
file format. I suppose the only reason that PDF readers build in
"cripple-respect" is because Adobe started that tradition
by positioning PDF as an opaque digital object to transfer corporate
letterheads and magazine excerpts from the source to a remote
printer. In other words, PDFs (and to some extent DOCs) provided
the illusion of paper-like palpability to a business world accustomed
to being awash in official documents printed on paper. Yet the
PDF file itself actually has no restrictions. It's immutability based on
mutually agreed illusion. :)
Yes, I already work with PDFs in this context and so I'm familiar with
all its nuances. (I also do PDF ebooks for clients) What I'm hoping to
achieve is the same ability to do what I currently do with the
ebook.exe stuff. It's just a matter of including a PDF viewer in its
compile. The new IE viewer has this built-in and the IE engine supports
viewing PDFs in the browser on a new tab OR new 'web page' window. In
this case, all the security of the ebook.exe will apply to PDF-based
ebooks. Also, the product will be able to include both formats when
compiled.
Post by Mayayana
I've recompiled XPDF and SumatraPDF myself, to ignore the
restrictions flag. It only requires editing the code to not check the
flag. Both XPDF and Sumatra are OSS. I haven't made my code public
only because I wanted to respect the intentions of the OSS authors.
The problem here, to my mind, is that
a lot of people put restrictions on their PDFs for no good reason.
If I download a PDF and want to read it, I usually want to translate
it to plain text because PDF format is a pain in the neck. It's designed
for printing, not for reading. On the other hand, if I print it I usually
want to print it as plain text, too, so as not to waste ink on pointless
page decoration. In most cases there's no reason for limitations in that
regard.
1) A few years ago I was in a car accident and downloaded the report
form. It was a PDF. I had to fill out 3 copies. But the State of Mass.
had locked the file so that I couldn't edit it! I ended up doing 2
screen shots and filling in the form as a BMP in Paint Shop Pro. Then
I printed the BMPs. I'm guessing that someone in state gov't just
locked the PDF without thinking, with some vague notion that official
documents are too official to leave unlocked.
In this case I would open the PDF in my authoring software, fill in the
form fields, save and return to sender.
Post by Mayayana
2) I have a blind friend who uses text-to-speech software. He sometimes
gets PDFs that are actually made up of scanned pages, saved in the
PDF as BMPs. He can use OCR software on the BMPs, but only if
he can take them out of the PDF. If the material in question is free
to the public then there's no reason for the BMPs to be locked in.
I agree with you that restrictions should be reasonable/sensible. In my
case I don't go beyond activation restrictions, whether it be certain
pages, certain chapters, or the entire ebook. Once accessed, readers
can do whatever they want.

Also, I don't use BMPs unless it's for images that I can't convert to
GIF format in Paint Shop Pro. In all cases, I prefer GIF images whether
it's CHM, EXE, or PDF. I insist that all text be just text and not a
scanned image of a doc.

Just to give an example, I had a client who authored some manuals in MS
Word and exported the doc as PDF. The Word files were over 62mb each,
largest one being 123mb. I redid these in MS Excel, reworking the
images in PSP, and putting the text directly into the worksheet. The
total of 4 Word docs was 428mb. All 4 were duplicated inside a single
Excel file and its size was 4.25mb (including index sheet). The PDFs
created from these worksheets are less than 900kb including permissions
and bookmarks, and default display settings. My point is that there's
lots of business to be had making the 'consumer' PDF solution results
into more e-friendly products, and so my interest in having same
capability with PDFs as I do with my ebook EXEs.
Post by Mayayana
Post by GS
I expect something like DsoFile.DLL is what AlJones is after since the
files he's looking at are compound docs by structure.
Yes, I guess that if that works it will provide an easy way to get the
"meta" file content. If it doesn't work it could be accessed directly.
Unfortunately, it looks like either method requires a dependency. DsoFile
is not a Windows system file. It would have to be added to one's software
install. Though it might be feasible to write VB code to do the same
thing. According to MS, DsoFile is just a wrapper for the IPropertyStorage
interface. There may be VB code for that around somewhere, like
at Eduardo Morcillo's website.
Yes, it's a wrapper for that interface which has to be distributed with
our project. I'd love to find a VB equivalent that we can package in
our projects but haven't found anything that I'd use. I know there are
pure VB solutions out there because I've downloaded a few for reviewing.
I might have even downloaded something from Eduardo's website. I don't
think there's anything on Brad Martinez's website.

An alternative to DsoFile is Dan Appleman's dwProp.dll (not free). It
has one drawback in that it classes the 'Keywords' property as a
DocumentSummaryProperty rather than a file SummaryProperty. I bought
this to obviate the issue that in certain cases DsoFile will not write
the 'Title' prop if there's no other props populated with values. This
is a major concern for my app that uses it since 'Title' is the only
way to tell what the file contains because the filenames are
alpha-numeric and may/may not have a file extension.

The drawback with dwProp is (as explained above) that it won't write
the Title prop to plain text files on NTFS volumes that support the
NTFS FileSummaryProperties. As of Win6 this is no longer supported for
non-compound file structures such as TXTs or the like.

DsoFile does handle the Keywords property correctly, though it should
be noted that (as of Win6) in a compound structure file this property
is named "Tag" on the 'Details' tab of the property dialog.
Post by Mayayana
I just looked at a LIT file to see what the compression method is.
7-ZIP calls it a CHM. Interesting. It also says an HXS is a CHM. So I guess
CHM format is a sort of class of file. Other than 7-ZIP I have no
idea what can open a CHM. Looking at it in a hex editor, both CHM and
LIT seem to consist of a large header that lists the files inside and the
compression type. Then at a further offset is the actual compressed data.
I'm guessing it's probably similar to CAB files, which typically use either
MSZIP or LZX compression. So parsing the file from VB would be a big
project. If it were me I'd look into IPropertyStorage, in hopes of getting
a compact, dependency-free and dependable method of accessing any
"metadata" that Microsoft has written.
I'd be interested to see if my conversion app can process a LIT same as
a CHM. Any ideas where I can get a sample? Or could I just do a Search
on my own machine?
GS
2011-05-07 17:14:24 UTC
Post by GS
The drawback with dwProp is (as explained above)
Oops! The above should read...

The drawback with DsoFile is (as explained above)
Mayayana
2011-05-07 17:34:04 UTC
| My point is that there's
| lots of business to be had making the 'consumer' PDF solution results
| into more e-friendly products, and so my interest in having same
| capability with PDFs as I do with my ebook EXEs.
|

Thanks for all that explanation. I don't know much about
eBooks and had never thought of the idea of doing something
like that as a business.

With the PDFs: I don't know if it's of any help, but mupdf
is what Sumatra uses. It's an OSS reader. I don't know anything
about the API for that, but if you're comfortable with a bit of C++
you can download the sort-of free C++ Express 2010 and build/edit
the project yourself.

| I'd be interested to see if my conversion app can process a LIT same as
| a CHM. Any ideas where I can get a sample? Or could I just do a Search
| on my own machine?
|
I found some here:

http://diskbooks.org/zobord.html

It seems to be some sort of Born-Again Christian site. I just
found it by searching Google. The downloads for Windows
are SFX ZIP files with RTF and LIT versions inside.