We originally used to read every node and allocate a jffs2_tmp_dnode_info
structure for each, before processing them in (reverse) version order
and discarding the ones which are obsoleted by later nodes.
With huge logfiles, this behaviour caused memory problems. For example, a
file involved in OLPC trac #1292 has 1822391 nodes, and would cause the XO
machine to run out of memory during the first stage of read_inode().
Instead of just inserting nodes into a tree in version order as we find
them, we now put them into a tree in order of their offset within the
file, which allows us to immediately discard nodes which are completely
obsoleted.
We don't use a full tree with 'fragments' pointing to the real data
structure, as we do in the normal fragtree. We sort only on the start
address, and add an 'overlapped' flag to the tmp_dnode_info to indicate
that the node in question is (partially) overlapped by another.
When the scan is complete, we start at the end of the file, adding each
node to a real fragtree as before. Where the node is non-overlapped, we
just add it (it doesn't matter that it's not the latest version; there is
no overlap). When the node at the end of the tree _is_ overlapped, we sort
it and all its overlapping nodes into version order and then add them to
the fragtree in that order.
This 'early discard' reduces the peak allocation of tmp_dnode_info
structures from 1.8M to a mere 62872 (3.5%) in the degenerate case
referenced above.
This version of the patch also correctly rememembers the highest node
version# seen for an inode when it's scanned.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
In read inode we have an optimization which prevents one
min. I/O unit (e.g. NAND page) to be read more then once.
Namely, at the beginning we do not know which node type we read,
so we read so we assume we read the directory entry, because it
has the smallest node header. When we read it, we read up to the
next min. I/O unit, just because if later we'll need to read more,
we already have this data.
If it turns out to be that the node is not directory entry, and
we need more data, and we did not read it because it sits in the
next min. I/O unit, we read the whole next (or several next)
min. I/O unit(s). And if it happens to be that we read a data node,
and we've read part of its data, we calculate partial CRC.
So if later we need to check data CRC, we'll only read the rest
of the data from further min. I/O units and continue CRC checking.
This code was a bit messy and buggy. The bug was that it assumed
relatively large min. I/O unit, so that the largest node header
could overlap only one min. I/O unit boundary.
This parch clean-ups the code a bit and fixes this bug.
The patch was not tested on flash with small min. I/O unit, like
NOR-ECC, nut it was tested on NAND with 512 bytes NAND page, so
it at least does not break NAND. It was also tested with mtdram
so it should not break NOR.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Due to a poor choice of CRC32 seed, a node header which is all zeroes
would pass the CRC32 check. Explicitly check for this case, and treat it
as we do a CRC failure.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Replace kmalloc+memset with kzalloc
Signed-off-by: Yan Burman <burman.yan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
jffs2_clear_acl() which releases acl caches allocated by kmalloc()
was defined but it was never called. Thus, we faced to the risk
of memory leaking.
This patch plugs jffs2_clear_acl() into jffs2_do_clear_inode().
It ensures to release acl cache when inode is cleared.
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
If xattr_ref is associated with an orphan inode_cache
on filesystem mounting, those xattr_refs are not
released even if this inode_cache is released.
This patch enables to call jffs2_xattr_delete_inode()
for such a irregular inode_cachde too.
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* git://git.infradead.org/~dwmw2/rbtree-2.6:
[RBTREE] Switch rb_colour() et al to en_US spelling of 'color' for consistency
Update UML kernel/physmem.c to use rb_parent() accessor macro
[RBTREE] Update hrtimers to use rb_parent() accessor macro.
[RBTREE] Add explicit alignment to sizeof(long) for struct rb_node.
[RBTREE] Merge colour and parent fields of struct rb_node.
[RBTREE] Remove dead code in rb_erase()
[RBTREE] Update JFFS2 to use rb_parent() accessor macro.
[RBTREE] Update eventpoll.c to use rb_parent() accessor macro.
[RBTREE] Update key.c to use rb_parent() accessor macro.
[RBTREE] Update ext3 to use rb_parent() accessor macro.
[RBTREE] Change rbtree off-tree marking in I/O schedulers.
[RBTREE] Add accessor macros for colour and parent fields of rb_node
Also, make sure dirents are marked REF_UNCHECKED when we 'discover' them
through eraseblock summary.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Especially when summary code is used, we can have in-memory data
structures referencing certain nodes without them actually being readable
on the flash. Discard the nodes gracefully in that case, rather than
triggering a BUG().
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Fix some bugs in mtd/jffs2 on 64bit platform.
The MEMGETBADBLOCK/MEMSETBADBLOCK ioctl are not listed in compat_ioctl.h.
And some variables in jffs2 are declared as uint32_t but used to hold
size_t values.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Simplify the debugging code further.
Update the TODO list
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When data starts from the beginning of NAND page, 'len' must be zero, not
c->wbuf_page.
Thanks to Zoltan Sogor for reporting this problem.
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
From: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Instead of building fragtree starting from node with the smallest version
number, start from the highest. This helps to avoid reading and checking
obsolete nodes.
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Replace the D1(printk()) style debugging with the new debug macros
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move functions to read inodes into readinode.c
Move functions to handle fragtree and dentry lists into nodelist.[ch]
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Rename functions to a name matching the functionality.
Remove stall debug code
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Various simplifiactions. printk format corrections.
Convert more code to use the new debug functions.
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If debugging is disabled, define debugging functions as empty macros, instead
of using Dx() explicitly.
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
JFFS2 uses f->dents to store the pointer to the symlink target string (in case
the inode is symlink). This is somewhat ugly to use the same field for
different reasons. Introduce distinct field f->target for this purpose.
Note, f->fragtree, f->dents, f->target may probably be put in a union.
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move debug functions into a seperate source file
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use an rbtree instead of a simple linked list. We were wasting
an amazing amount of time in jffs2_add_tn_to_list().
Thanks to Artem Bityuckiy and Jarkko Jlavinen for noticing.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Don't remove inocache for inodes which are in read_inode() or
clear_inode() until they're done.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!