kernel-ark/arch/arm/mm
Ming Lei 81f28946a8 ARM: 7746/1: mm: lazy cache flushing on non-mapped pages
Currently flush_dcache_page() treats pages as non-mapped only if
mapping_mapped(mapping) returns false. This approach is very
coarse:
	- mmap on part of a file may cause all pages backed by
	the file to be treated as mmaped

	- file-backed pages aren't actually mapped into user space
	if the memory mmaped on the file isn't accessed

This patch uses page_mapped() to decide if the page has been
mapped.
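
The difference between the two tests can be sketched in user space as
below. This is only a toy model under stated assumptions: the toy_*
structures and helpers are hypothetical stand-ins invented for
illustration, not the kernel's definitions, and the model ignores
everything else flush_dcache_page() has to consider.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's objects. */
struct toy_mapping {
	int nr_vmas;		/* VMAs that map any part of the file */
};

struct toy_page {
	struct toy_mapping *mapping;
	int mapcount;		/* user-space mappings of this very page */
};

/* File-level test: true as soon as any part of the file is mmaped. */
static bool toy_mapping_mapped(const struct toy_mapping *m)
{
	return m->nr_vmas > 0;
}

/* Page-level test: true only if this page sits in some page table. */
static bool toy_page_mapped(const struct toy_page *p)
{
	return p->mapcount > 0;
}

/* Old heuristic: defer the flush only if nothing maps the file. */
static bool toy_can_defer_old(const struct toy_page *p)
{
	return p->mapping != NULL && !toy_mapping_mapped(p->mapping);
}

/* New heuristic: defer the flush if nothing maps this page. */
static bool toy_can_defer_new(const struct toy_page *p)
{
	return p->mapping != NULL && !toy_page_mapped(p);
}
```

For a page of an mmaped file that user space never touched (nr_vmas
of 1, mapcount of 0), the old heuristic in this model flushes
immediately while the new one can defer the flush.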

With the attached test code, I see a large performance
improvement (>25%) when accessing the page cache via read() in
this situation, so memcpy benefits a lot from not flushing the
cache.

No.   read time without the patch	No. read time with the patch
================================================================
No. 0, time  22615636 us		No. 0, time  22014717 us
No. 1, time  4387851 us 		No. 1, time  3113184 us
No. 2, time  4276535 us 		No. 2, time  3005244 us
No. 3, time  4259821 us 		No. 3, time  3001565 us
No. 4, time  4263811 us 		No. 4, time  3002748 us
No. 5, time  4258486 us 		No. 5, time  3004104 us
No. 6, time  4253009 us 		No. 6, time  3002188 us
No. 7, time  4262809 us 		No. 7, time  2998196 us
No. 8, time  4264525 us 		No. 8, time  3007255 us
No. 9, time  4267795 us 		No. 9, time  3005094 us

1) No. 0 reads the file from the storage device; the other runs
basically read the file from the page cache.
2) The file size is 512M, and the file is on ext4 over USB mass storage.
3) The test was done on a Pandaboard.
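
As a cross-check of the >25% figure, the cached runs (No. 1 to No. 9)
from the table above can be summed and compared. The helper names
below are made up for this sketch; only the timings come from the
table. The reduction in total read time works out to roughly 29%.

```c
#include <stddef.h>

/* Cached-read timings in microseconds, runs No. 1 to No. 9. */
static const unsigned long long without_patch_us[] = {
	4387851, 4276535, 4259821, 4263811, 4258486,
	4253009, 4262809, 4264525, 4267795,
};

static const unsigned long long with_patch_us[] = {
	3113184, 3005244, 3001565, 3002748, 3004104,
	3002188, 2998196, 3007255, 3005094,
};

/* Sum n timings. */
static unsigned long long sum_us(const unsigned long long *t, size_t n)
{
	unsigned long long total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += t[i];
	return total;
}

/* Percentage by which the total cached read time drops with the patch. */
static double read_time_reduction_pct(void)
{
	size_t n = sizeof(without_patch_us) / sizeof(without_patch_us[0]);
	double before = (double)sum_us(without_patch_us, n);
	double after = (double)sum_us(with_patch_us, n);

	return 100.0 * (before - after) / before;
}
```

Run No. 0 is left out on purpose, since it is dominated by reading
from the storage device rather than by the cache flushing.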

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>

unsigned int  sum = 0;
unsigned long sum_val = 0;

/* elapsed time from tv1 to tv2 in microseconds */
static unsigned long tv_diff(struct timeval *tv1, struct timeval *tv2)
{
	return (tv2->tv_sec - tv1->tv_sec) * 1000000 +
		(tv2->tv_usec - tv1->tv_usec);
}

int main(int argc, char *argv[])
{
	char *mbuf;
	int fd;
	unsigned long i;
	unsigned long page_size, size;
	struct stat st;
	struct timeval t1, t2;
	unsigned char *rbuf;

	if (argc < 2) {
		printf("usage: %s <file>\n", argv[0]);
		exit(-1);
	}

	/* page_size must be known before the read buffer is sized */
	page_size = getpagesize();
	rbuf = malloc(32 * page_size);
	if (!rbuf) {
		printf("\t%s\n", "malloc failed");
		exit(-1);
	}

	fd = open(argv[1], O_RDWR);
	assert(fd >= 0);

	fstat(fd, &st);
	size = st.st_size;
	printf("%s: file %s, size %lu, page size %lu\n",
		argv[0],
		argv[1], size, page_size);

	gettimeofday(&t1, NULL);
	/* map the whole file but never touch it; all access goes via read() */
	mbuf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (mbuf == MAP_FAILED) {
		printf("\t%s\n", "mmap failed");
		exit(-1);
	}

	/* the file size is expected to be a multiple of 32 pages */
	for (i = 0 ; i < size ; i += (page_size * 32)) {
		ssize_t rcnt;
		lseek(fd, i, SEEK_SET);
		rcnt = read(fd, rbuf, page_size * 32);
		if (rcnt != (ssize_t)(page_size * 32)) {
			printf("%s: read failed\n", __func__);
			exit(-1);
		}
	}
	free(rbuf);
	munmap(mbuf, size);
	gettimeofday(&t2, NULL);
	printf("\tread mmaped time: %luus\n", tv_diff(&t1, &t2));

	close(fd);
	return 0;
}

Cc: Michel Lespinasse <walken@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-05 23:37:32 +01:00
..
abort-ev4.S
abort-ev4t.S
abort-ev5t.S
abort-ev5tj.S
abort-ev6.S ARM: 7396/1: errata: only handle ARM erratum #326103 on affected cores 2012-04-23 14:21:52 +01:00
abort-ev7.S
abort-lv4t.S
abort-macro.S ARM: 7088/1: entry: fix wrong parameter name used in do_thumb_abort 2011-09-10 23:39:56 +01:00
abort-nommu.S
alignment.c Merge branch 'for-next' of git://git.pengutronix.de/git/ukl/linux into devel-stable 2013-03-09 15:49:32 +00:00
cache-aurora-l2.h ARM: 7547/4: cache-l2x0: add support for Aurora L2 cache ctrl 2012-11-06 19:47:35 +00:00
cache-fa.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
cache-feroceon-l2.c ARM: 7696/1: Fix kexec by setting outer_cache.inv_all for Feroceon 2013-04-17 16:53:27 +01:00
cache-l2x0.c ARM: 7716/1: bcm281xx: Add L2 support for Rev A2 chips 2013-05-15 19:39:27 +01:00
cache-tauros2.c ARM: cache: add dt support for tauros2 cache 2012-08-16 16:16:50 +08:00
cache-v4.S ARM: mm: remove broken condition check for v4 flushing 2013-03-26 09:55:34 +00:00
cache-v4wb.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
cache-v4wt.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
cache-v6.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
cache-v7.S arm: Add v7_invalidate_l1 to cache-v7.S 2013-02-11 19:37:24 -08:00
cache-xsc3l2.c ARM: move CP15 definitions to separate header file 2012-03-28 18:30:01 +01:00
context.c ARM: 7684/1: errata: Workaround for Cortex-A15 erratum 798181 (TLBI/DSB operations) 2013-04-03 16:45:49 +01:00
copypage-fa.c arm: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:14 +08:00
copypage-feroceon.c arm: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:14 +08:00
copypage-v4mc.c Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm 2012-03-29 16:53:48 -07:00
copypage-v4wb.c arm: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:14 +08:00
copypage-v4wt.c arm: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:14 +08:00
copypage-v6.c Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm 2012-03-29 16:53:48 -07:00
copypage-xsc3.c arm: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:14 +08:00
copypage-xscale.c Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm 2012-03-29 16:53:48 -07:00
dma-mapping.c ARM: 7730/1: DMA-mapping: mark all !DMA_TO_DEVICE pages in unmapping as clean 2013-05-23 00:09:45 +01:00
extable.c
fault-armv.c mm: replace vma prio_tree with an interval tree 2012-10-09 16:22:39 +09:00
fault.c readahead: fault retry breaks mmap file read random detection 2012-10-09 16:22:47 +09:00
fault.h ARM: LPAE: Add fault handling support 2011-12-08 10:30:40 +00:00
flush.c ARM: 7746/1: mm: lazy cache flushing on non-mapped pages 2013-06-05 23:37:32 +01:00
fsr-2level.c ARM: LPAE: Move the FSR definitions to separate files 2011-12-08 10:30:37 +00:00
fsr-3level.c ARM: LPAE: Add fault handling support 2011-12-08 10:30:40 +00:00
highmem.c Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm 2012-03-29 16:53:48 -07:00
idmap.c ARM: KVM: move to a KVM provided HYP idmap 2013-04-28 22:23:08 -07:00
init.c mm/ARM: use free_highmem_page() to free highmem pages into buddy system 2013-04-29 15:54:31 -07:00
iomap.c arm/PCI: remove arch pci_flags definition 2012-02-23 20:18:56 -07:00
ioremap.c ARM: 7728/1: mm: Use phys_addr_t properly for ioremap functions 2013-05-23 00:09:44 +01:00
Kconfig Merge branches 'devel-stable', 'entry', 'fixes', 'mach-types', 'misc' and 'smp-hotplug' into for-linus 2013-05-02 21:30:36 +01:00
Makefile ARM: cache: remove ARMv3 support code 2013-03-26 09:55:23 +00:00
mm.h ARM: 7645/1: ioremap: introduce an infrastructure for static mapped area 2013-02-16 17:54:22 +00:00
mmap.c Merge branch 'akpm' (Andrew's patchbomb) 2012-12-11 18:05:37 -08:00
mmu.c Merge branches 'devel-stable', 'entry', 'fixes', 'mach-types', 'misc' and 'smp-hotplug' into for-linus 2013-05-02 21:30:36 +01:00
nommu.c ARM: 7728/1: mm: Use phys_addr_t properly for ioremap functions 2013-05-23 00:09:44 +01:00
pabort-legacy.S
pabort-v6.S
pabort-v7.S
pgd.c ARM: move CP15 definitions to separate header file 2012-03-28 18:30:01 +01:00
proc-arm7tdmi.S ARM: proc-*.S: place cpu_reset functions into .idmap.text section 2011-12-06 14:04:14 +00:00
proc-arm9tdmi.S ARM: proc-*.S: place cpu_reset functions into .idmap.text section 2011-12-06 14:04:14 +00:00
proc-arm720.S ARM: proc-*.S: place cpu_reset functions into .idmap.text section 2011-12-06 14:04:14 +00:00
proc-arm740.S ARM: mm: fix numerous hideous errors in proc-arm740.S 2013-03-26 09:55:33 +00:00
proc-arm920.S ARM: Do 15e0d9e37c (ARM: pm: let platforms select cpu_suspend support) properly 2013-04-08 12:00:38 +01:00
proc-arm922.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
proc-arm925.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
proc-arm926.S ARM: Do 15e0d9e37c (ARM: pm: let platforms select cpu_suspend support) properly 2013-04-08 12:00:38 +01:00
proc-arm940.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
proc-arm946.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
proc-arm1020.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
proc-arm1020e.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
proc-arm1022.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
proc-arm1026.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
proc-fa526.S Disintegrate asm/system.h for ARM 2012-03-28 18:30:01 +01:00
proc-feroceon.S ARM: 7542/1: mm: fix cache LoUIS API for xscale and feroceon 2012-09-28 21:09:50 +01:00
proc-macros.S ARM: 7649/1: mm: mm->context.id fix for big-endian 2013-02-16 17:54:26 +00:00
proc-mohawk.S ARM: Do 15e0d9e37c (ARM: pm: let platforms select cpu_suspend support) properly 2013-04-08 12:00:38 +01:00
proc-sa110.S ARM: proc-*.S: place cpu_reset functions into .idmap.text section 2011-12-06 14:04:14 +00:00
proc-sa1100.S ARM: Do 15e0d9e37c (ARM: pm: let platforms select cpu_suspend support) properly 2013-04-08 12:00:38 +01:00
proc-syms.c ARM: modules: don't export cpu_set_pte_ext when !MMU 2013-03-26 09:55:34 +00:00
proc-v6.S Merge branches 'devel-stable', 'entry', 'fixes', 'mach-types', 'misc' and 'smp-hotplug' into for-linus 2013-05-02 21:30:36 +01:00
proc-v7-2level.S ARM: 7691/1: mm: kill unused TLB_CAN_READ_FROM_L1_CACHE and use ALT_SMP instead 2013-04-03 17:39:07 +01:00
proc-v7-3level.S ARM: 7691/1: mm: kill unused TLB_CAN_READ_FROM_L1_CACHE and use ALT_SMP instead 2013-04-03 17:39:07 +01:00
proc-v7.S Merge branches 'devel-stable', 'entry', 'fixes', 'mach-types', 'misc' and 'smp-hotplug' into for-linus 2013-05-02 21:30:36 +01:00
proc-xsc3.S ARM: Do 15e0d9e37c (ARM: pm: let platforms select cpu_suspend support) properly 2013-04-08 12:00:38 +01:00
proc-xscale.S ARM: Do 15e0d9e37c (ARM: pm: let platforms select cpu_suspend support) properly 2013-04-08 12:00:38 +01:00
tcm.h ARM: 7694/1: ARM, TCM: initialize TCM in paging_init(), instead of setup_arch() 2013-04-17 16:53:24 +01:00
tlb-fa.S Merge branch 'devel-stable' into for-next 2011-07-22 23:09:07 +01:00
tlb-v4.S
tlb-v4wb.S ARM: mm: tlb-v4wb: Use the new processor struct macros 2011-07-07 15:31:12 +01:00
tlb-v4wbi.S ARM: mm: tlb-v4wbi: Use the new processor struct macros 2011-07-07 15:31:12 +01:00
tlb-v6.S Merge branch 'devel-stable' into for-next 2011-07-22 23:09:07 +01:00
tlb-v7.S ARM: 7489/1: errata: fix workaround for erratum #720789 on UP systems 2012-08-11 09:16:00 +01:00