Date: Thu, 28 Jan 2010 11:23:05 +0530
From: Sharyathi Nagesh <sharyath@in.ibm.com>
To: linux-numa@vger.kernel.org, Andi Kleen <andi@firstfloor.org>,
	Christoph Lameter <clameter@sgi.com>, Cliff Wickman <cpw@sgi.com>,
	Lee Schermerhorn <lee.schermerhorn@hp.com>,
	Amit K Arora <amitarora@in.ibm.com>, deepti.kalra@in.ibm.com
Subject: Fix to numa_node_to_cpus_v2

Hi

We observed that numa_node_to_cpus api() api converts a node number to a 
bitmask of CPUs. The user must pass a long enough buffer. If the buffer is not
long enough errno will be set to ERANGE and -1 returned. On success 0 is returned.
This api has been changed in numa version 2.0. It has new implementation (_v2)

Analysis:
Now within the numa_node_to_cpus code there is a check if the size of buffer
passed from the user matches the one returned by the sched_getaffinity. This
check fails and hence we see "map size mismatch: abort" messages coming out on
console. My system has 4 node and 8 CPUs.

Testcase to reproduce the problem:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <numa.h>

typedef unsigned long BUF[64];

int numa_exit_on_error = 0;

void node_to_cpus(void)
{
         int i;
         BUF cpubuf;
         BUF affinityCPUs;
         int maxnode = numa_max_node();
         printf("available: %d nodes (0-%d)\n", 1+maxnode, maxnode);
         for (i = 0; i <= maxnode; i++) {
                 printf("Calling numa_node_to_cpus()\n");
                 printf("Size of BUF is : %d \n",sizeof(BUF));
                 if ( 0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF)) ) {
                         printf("Calling numa_node_to_cpus() again \n");
                         if ( 0 == numa_node_to_cpus(i, cpubuf, sizeof(BUF)) ) {
                         } else {
                                 printf("Got < 0 \n");
                                 numa_error("numa_node_to_cpu");
                                 numa_exit_on_error = 1;
                                 exit(numa_exit_on_error);
                         }
                 } else {
                         numa_error("numa_node_to_cpu 0");
                         numa_exit_on_error = 1;
                         exit(numa_exit_on_error);
                 }
         }
}
int main()
{
         void node_to_cpus();
         if (numa_available() < 0)
         {
             printf("This system does not support NUMA policy\n");
             numa_error("numa_available");
             numa_exit_on_error = 1;
             exit(numa_exit_on_error);
         }
         node_to_cpus();
         return numa_exit_on_error;
}
--------------------------------------------------------------------------------

Problem Fix:
The fix is to allow numa_node_to_cpus_v2() to fail only when the supplied 
buffer is smaller than the bitmask required to represent online NUMA nodes.
Attaching the patch to address this issues, patch is generated against numactl-2.0.4-rc1

Regards
Yeehaw

---
 libnuma.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

Index: numactl-dev/libnuma.c
===================================================================
--- numactl-dev.orig/libnuma.c
+++ numactl-dev/libnuma.c
@@ -1016,9 +1016,6 @@ copy_bitmask_to_bitmask(struct bitmask *
 		memcpy(bmpto->maskp, bmpfrom->maskp, bytes);
 		memset(((char *)bmpto->maskp)+bytes, 0,
 					CPU_BYTES(bmpto->size)-bytes);
-	} else {
-		bytes = CPU_BYTES(bmpfrom->size);
-		memcpy(bmpto->maskp, bmpfrom->maskp, bytes);
 	}
 }
 
@@ -1272,11 +1269,11 @@ numa_node_to_cpus_v2(int node, struct bi
 
 	if (node_cpu_mask_v2[node]) {
 		/* have already constructed a mask for this node */
-		if (buffer->size != node_cpu_mask_v2[node]->size) {
+		if (buffer->size < node_cpu_mask_v2[node]->size) {
 			numa_error("map size mismatch; abort\n");
 			return -1;
 		}
-		memcpy(buffer->maskp, node_cpu_mask_v2[node]->maskp, bufferlen);
+		copy_bitmask_to_bitmask(node_cpu_mask_v2[node], buffer);
 		return 0;
 	}
 
